DESCRIPTION

TOXINS ACTIVE AGAINST PESTS

Cross-Reference to a Related Application 
This application is a continuation-in-part of application Serial No. 08/674,002, filed July 1, 1996.

Background of the Invention 
The soil microbe Bacillus thuringiensis (B.t.) is a Gram-positive, spore-forming 
bacterium. Most strains of B.t. do not exhibit pesticidal activity. Some B.t. strains produce, and 
can be characterized by, parasporal crystalline protein inclusions. These "δ-endotoxins" are
different from exotoxins, which have a non-specific host range. These inclusions often appear 
microscopically as distinctively shaped crystals. The proteins can be highly toxic to pests and 
specific in their toxic activity. Certain B.t. toxin genes have been isolated and sequenced, and 
recombinant DNA-based B.t. products have been produced and approved for use. In addition,
with the use of genetic engineering techniques, new approaches for delivering B.t. toxins to
agricultural environments are under development, including the use of plants genetically
engineered with B.t. toxin genes for insect resistance and the use of stabilized intact microbial
cells as B.t. toxin delivery vehicles (Gaertner, F.H., L. Kim [1988] TIBTECH 6:S4-S7). Thus,
isolated B.t. endotoxin genes are becoming commercially valuable. 

Until the last fifteen years, commercial use of B.t. pesticides has been largely restricted
to a narrow range of lepidopteran (caterpillar) pests. Preparations of the spores and crystals of 
B. thuringiensis subsp. kurstaki have been used for many years as commercial insecticides for 
lepidopteran pests. For example, B. thuringiensis var. kurstaki HD-1 produces a crystalline
δ-endotoxin which is toxic to the larvae of a number of lepidopteran insects.

In recent years, however, investigators have discovered B.t. pesticides with specificities
for a much broader range of pests. For example, other species of B.t., namely israelensis and
morrisoni (a.k.a. tenebrionis, a.k.a. B.t. M-7, a.k.a. B.t. san diego), have been used commercially
to control insects of the orders Diptera and Coleoptera, respectively (Gaertner, F.H. [1989]
"Cellular Delivery Systems for Insecticidal Proteins: Living and Non-Living Microorganisms,"
in Controlled Delivery of Crop Protection Agents, R.M. Wilkins, ed., Taylor and Francis, New
York and London, 1990, pp. 245-255). See also Couch, T.L. (1980) "Mosquito Pathogenicity
of Bacillus thuringiensis var. israelensis," Developments in Industrial Microbiology 22:61-76;
and Beegle, C.C. (1978) "Use of Entomogenous Bacteria in Agroecosystems," Developments
in Industrial Microbiology 20:97-104. Krieg, A., A.M. Huger, G.A. Langenbruch, W.
Schnetter (1983) Z. ang. Ent. 96:500-508 describe Bacillus thuringiensis var. tenebrionis, which
is reportedly active against two beetles in the order Coleoptera. These are the Colorado potato
beetle, Leptinotarsa decemlineata, and Agelastica alni.

Recently, new subspecies of B.t. have been identified, and genes responsible for active
δ-endotoxin proteins have been isolated (Höfte, H., H.R. Whiteley [1989] Microbiological
Reviews 52(2):242-255). Höfte and Whiteley classified B.t. crystal protein genes into four major
classes. The classes were CryI (Lepidoptera-specific), CryII (Lepidoptera- and Diptera-specific),
CryIII (Coleoptera-specific), and CryIV (Diptera-specific). The discovery of strains specifically
toxic to other pests has been reported (Feitelson, J.S., J. Payne, L. Kim [1992] Bio/Technology
10:271-275). CryV has been proposed to designate a class of toxin genes that are
nematode-specific. Lambert et al. (Lambert, B., L. Buysse, C. Decock, S. Jansens, C. Piens, B. Saey, J.
Seurinck, K. van Audenhove, J. Van Rie, A. Van Vliet, M. Peferoen [1996] Appl. Environ.
Microbiol. 62(1):80-86) and Shevelev et al. ([1993] FEBS Lett. 336:79-82) describe the
characterization of Cry9 toxins active against lepidopterans. Published PCT applications WO
94/05771 and WO 94/24264 also describe B.t. isolates active against lepidopteran pests. Gleave
et al. ([1991] JGM 138:55-62) and Smulevitch et al. ([1991] FEBS Lett. 293:25-26) also
describe B.t. toxins. A number of other classes of B.t. genes have now been identified.

The cloning and expression of a B.t. crystal protein gene in Escherichia coli has been 
described in the published literature (Schnepf, H.E., H.R. Whiteley [1981] Proc. Natl Acad. Sci. 
USA 78:2893-2897.). U.S. Patent 4,448,885 and U.S. Patent 4,467,036 both disclose the 
expression of B.t. crystal protein in E. coli. U.S. Patents 4,990,332; 5,039,523; 5,126,133; 
5,164,180; and 5,169,629 are among those which disclose B.t. toxins having activity against 
lepidopterans. PCT application WO96/05314 discloses PS86W1, PS86V1, and other B.t. 
isolates active against lepidopteran pests. The PCT patent applications published as 
WO94/24264 and WO94/05771 describe B.t. isolates and toxins active against lepidopteran
pests. B.t. proteins with activity against members of the family Noctuidae are described by 
Lambert et al, supra. U.S. Patents 4,797,276 and 4,853,331 disclose B. thuringiensis strain 
tenebrionis which can be used to control coleopteran pests in various environments. U.S. Patent 
No. 4,918,006 discloses B.t. toxins having activity against dipterans. U.S. Patent No. 5,151,363 
and U.S. Patent No. 4,948,734 disclose certain isolates of B.t. which have activity against 
nematodes. Other U.S. patents which disclose activity against nematodes include 5,093,120; 
5,236,843; 5,262,399; 5,270,448; 5,281,530; 5,322,932; 5,350,577; 5,426,049; and 5,439,881.
As a result of extensive research and investment of resources, other patents have issued for new
B.t. isolates and new uses of B.t. isolates. See Feitelson et al., supra, for a review. However,
the discovery of new B.t. isolates and new uses of known B.t. isolates remains an empirical, 
unpredictable art. 

Isolating responsible toxin genes has been a slow empirical process. Carozzi et al.
(Carozzi, N.B., V.C. Kramer, G.W. Warren, S. Evola, G. Koziel [1991] Appl. Env. Microbiol.
57(11):3057-3061) describe methods for identifying novel B.t. isolates. This report does not
disclose or suggest the specific primers, probes, toxins, and genes of the subject invention for
lepidopteran-active toxin genes. U.S. Patent No. 5,204,237 describes specific and universal
probes for the isolation of B.t. toxin genes. This patent, however, does not describe the probes,
primers, toxins, and genes of the subject invention.

WO 94/21795 and Estruch, J.J. et al. ([1996] PNAS 93:5389-5394) describe toxins
obtained from Bacillus microbes. These toxins are reported to be produced during vegetative
cell growth and were thus termed vegetative insecticidal proteins (VIP). These toxins were
reported to be distinct from crystal-forming δ-endotoxins. Activity of these toxins against
lepidopteran pests was reported.

Black cutworm (Agrotis ipsilon (Hufnagel); Lepidoptera: Noctuidae) is a serious pest
of many crops including maize, cotton, cole crops (Brassica, broccoli, cabbages, Chinese 
cabbages), and turf. Secondary host plants include beetroots, Capsicum (peppers), chickpeas, 
faba beans, lettuces, lucerne, onions, potatoes, radishes, rape (canola), rice, soybeans, 
strawberries, sugarbeet, tobacco, tomatoes, and forest trees. In North America, pests of the
genus Agrotis feed on clover, corn, tobacco, hemp, onion, strawberries, blackberries, raspberries, 
alfalfa, barley, beans, cabbage, oats, peas, potatoes, sweetpotatoes, tomato, garden flowers, 
grasses, lucerne, maize, asparagus, grapes, almost any kind of leaf, weeds, and many other crops 
and garden plants. Other cutworms in the Tribe Agrotini are pests, in particular those in the 
genus Feltia (e.g., F. jaculifera (Guenee); equivalent to ducens subgothica) and Euxoa (e.g., E.
messoria (Harris), E. scandens (Riley), E. auxiliaris Smith, E. detersa (Walker), E. tessellata
(Harris), E. ochrogaster (Guenee)). Host plants include various crops, including rape.

Cutworms are also pests outside North America, and the more economically significant
pests attack chickpeas, wheat, vegetables, sugarbeet, lucerne, maize, potatoes, turnips, rape,
lettuces, strawberries, loganberries, flax, cotton, soybeans, tobacco, beetroots, Chinese cabbages,
tomatoes, aubergines, sugarcane, pastures, cabbages, groundnuts, Cucurbita, turnips, sunflowers,
Brassica, onions, leeks, celery, sesame, asparagus, rhubarb, chicory, greenhouse crops, and
spinach. The black cutworm A. ipsilon occurs as a pest outside North America, including
Central America, Europe, Asia, Australasia, Africa, India, Taiwan, Mexico, Egypt, and New
Zealand.

Cutworms progress through several instars as larvae. Although seedling cutting by later 
instar larvae produces the most obvious damage and economic loss, leaf feeding commonly
results in yield loss in crops such as maize. Upon reaching the fourth larval instar, larvae begin 
to cut plants and plant parts, especially seedlings. Because of the shift in feeding behavior, 
economically damaging populations may build up unexpectedly with few early warning signs. 
Their nocturnal habit and behavior of burrowing into the ground also makes detection 
problematic. Large cutworms can destroy several seedlings per day, and a heavy infestation can 
remove entire stands of crops. 

Cultural controls for A. ipsilon such as peripheral weed control can help prevent heavy
infestations; however, such methods are not always feasible or effective. Infestations are very 
sporadic, and applying an insecticide prior to planting or at planting has not been effective in the 
past. Some baits are available for control of cutworms in crops. To protect turfgrass such as 
creeping bentgrass, chemical insecticides have been employed. Use of chemical pesticides is 
a particular concern in turf because of the close contact the public has with treated areas (e.g., 
golf greens, athletic fields, parks and other recreational areas, professional landscaping, home 
lawns). Natural products (e.g., nematodes, azadirachtin) generally perform poorly. To date,
Bacillus thuringiensis products have not been widely used to control black cutworm because 
highly effective toxins have not been available. 

Brief Summary of the Invention

The subject invention concerns materials and methods useful in the control of non- 
mammalian pests and, particularly, plant pests. In a specific embodiment, the subject invention 
provides new toxins useful for the control of lepidopterans. In a particularly preferred
embodiment, the toxins of the subject invention are used to control black cutworm. The subject
invention further provides nucleotide sequences which encode the lepidopteran-active toxins of
the subject invention. The subject invention further provides nucleotide sequences and methods 
useful in the identification and characterization of genes which encode pesticidal toxins. The 
subject invention further provides new Bacillus thuringiensis isolates having pesticidal activities. 

In one embodiment, the subject invention concerns unique nucleotide sequences which
are useful as primers in PCR techniques. The primers produce characteristic gene fragments
which can be used in the identification and isolation of specific toxin genes. The nucleotide
sequences of the subject invention encode toxins which are distinct from previously-described
δ-endotoxins.

In one embodiment of the subject invention, B.t. isolates can be cultivated under 
conditions resulting in high multiplication of the microbe. After treating the microbe to provide 
single-stranded genomic nucleic acid, the DNA can be contacted with the primers of the 
invention and subjected to PCR amplification. Characteristic fragments of toxin-encoding genes 
will be amplified by the procedure, thus identifying the presence of the toxin-encoding gene(s). 
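
The primer-based identification described above can be illustrated with a small in silico sketch. This is only an illustration under stated assumptions: the template and primer sequences below are invented for the example and are not SEQ ID NOS. 1-6, the function names are hypothetical, and exact string matching stands in for primer annealing, which in practice tolerates mismatches and depends on reaction conditions.

```python
# Minimal in silico PCR sketch (hypothetical sequences, exact matching only).

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def predicted_amplicon(template: str, forward: str, reverse: str):
    """Return the fragment bounded by the two primers, or None if either is absent.

    The forward primer is searched on the given strand; the reverse primer
    anneals to the opposite strand, so its reverse complement is searched
    downstream of the forward primer site.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Invented example: a template carrying both primer sites, and one that does not.
FORWARD = "ATGAAACCG"
REVERSE = "TAACGTACG"
TEMPLATE = "GGATCC" + "ATGAAACCG" + "TTTGGGCCCAAATTT" + "CGTACGTTA" + "GAATTC"

amplicon = predicted_amplicon(TEMPLATE, FORWARD, REVERSE)
no_product = predicted_amplicon("AAAAAAAAAAAAAAAA", FORWARD, REVERSE)
```

In this sketch, a returned fragment of the expected length plays the role of the "characteristic fragment" indicating that a toxin-encoding gene is present, while `None` corresponds to no amplification product.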

A further aspect of the subject invention is the use of the disclosed nucleotide sequences 
as probes to detect, identify, and characterize genes encoding B.t. toxins which are active against 
lepidopterans. 

Further aspects of the subject invention include the genes and isolates identified using 
the methods and nucleotide sequences disclosed herein. The genes thus identified encode toxins 
active against lepidopterans. Similarly, the isolates will have activity against these pests. 

New pesticidal B.t. isolates of the subject invention include PS31G1, PS185U2, PS11B,
PS218G2, PS213E5, PS28C, PS86BB1, PS89J3, PS94R1, PS27J2, PS101DD, and PS202S. 

As described herein, the toxins useful according to the subject invention may be 
chimeric toxins produced by combining portions of multiple toxins. 

In a preferred embodiment, the subject invention concerns plant cells transformed with
at least one polynucleotide sequence of the subject invention such that the transformed plant 
cells express pesticidal toxins in tissues consumed by the target pests. Such transformation of 
plants can be accomplished using techniques well known to those skilled in the art and would 
typically involve modification of the gene to optimize expression of the toxin in plants. 

Alternatively, the B.t. isolates of the subject invention, or recombinant microbes 
expressing the toxins described herein, can be used to control pests. In this regard, the invention 
includes the treatment of substantially intact B.t. cells, and/or recombinant cells containing the 
expressed toxins of the invention, treated to prolong the pesticidal activity when the substantially 
intact cells are applied to the environment of a target pest. The treated cell acts as a protective 
coating for the pesticidal toxin. The toxin becomes active upon ingestion by a target insect. 



Brief Description of the Sequences 
SEQ ID NO. 1 is a forward primer useful according to the subject invention.
SEQ ID NO. 2 is a reverse primer useful according to the subject invention.
SEQ ID NO. 3 is a forward primer useful according to the subject invention.
SEQ ID NO. 4 is a reverse primer useful according to the subject invention.
SEQ ID NO. 5 is a forward primer useful according to the subject invention.
SEQ ID NO. 6 is a reverse primer useful according to the subject invention.

SEQ ID NO. 7 is an amino acid sequence of the toxin designated 11B1AR.
SEQ ID NO. 8 is a nucleotide sequence encoding an amino acid sequence of toxin 11B1AR (SEQ ID NO. 7).

SEQ ID NO. 9 is an amino acid sequence of the toxin designated 11B1BR.
SEQ ID NO. 10 is a nucleotide sequence encoding an amino acid sequence of toxin 11B1BR (SEQ ID NO. 9).

SEQ ID NO. 11 is an amino acid sequence of the toxin designated 1291A.
SEQ ID NO. 12 is a nucleotide sequence encoding an amino acid sequence of toxin 1291A (SEQ ID NO. 11).

SEQ ID NO. 13 is an amino acid sequence of the toxin designated 1292A.
SEQ ID NO. 14 is a nucleotide sequence encoding an amino acid sequence of toxin 1292A (SEQ ID NO. 13).

SEQ ID NO. 15 is an amino acid sequence of the toxin designated 1292B.
SEQ ID NO. 16 is a nucleotide sequence encoding an amino acid sequence of toxin 1292B (SEQ ID NO. 15).

SEQ ID NO. 17 is an amino acid sequence of the toxin designated 31GA.
SEQ ID NO. 18 is a nucleotide sequence encoding an amino acid sequence of toxin 31GA (SEQ ID NO. 17).

SEQ ID NO. 19 is an amino acid sequence of the toxin designated 31GBR.
SEQ ID NO. 20 is a nucleotide sequence encoding an amino acid sequence of toxin 31GBR (SEQ ID NO. 19).

SEQ ID NO. 21 is an amino acid sequence of the toxin designated 85N1R identified by the method of the subject invention.
SEQ ID NO. 22 is a nucleotide sequence encoding an amino acid sequence of toxin 85N1R (SEQ ID NO. 21).

SEQ ID NO. 23 is an amino acid sequence of the toxin designated 85N2.
SEQ ID NO. 24 is a nucleotide sequence encoding an amino acid sequence of toxin 85N2 (SEQ ID NO. 23).

SEQ ID NO. 25 is an amino acid sequence of the toxin designated 85N3.
SEQ ID NO. 26 is a nucleotide sequence encoding an amino acid sequence of toxin 85N3 (SEQ ID NO. 25).

SEQ ID NO. 27 is an amino acid sequence of the toxin designated 86V1C1.
SEQ ID NO. 28 is a nucleotide sequence encoding an amino acid sequence of toxin 86V1C1 (SEQ ID NO. 27).

SEQ ID NO. 29 is an amino acid sequence of the toxin designated 86V1C2.
SEQ ID NO. 30 is a nucleotide sequence encoding an amino acid sequence of toxin 86V1C2 (SEQ ID NO. 29).

SEQ ID NO. 31 is an amino acid sequence of the toxin designated 86V1C3R.
SEQ ID NO. 32 is a nucleotide sequence encoding an amino acid sequence of toxin 86V1C3R (SEQ ID NO. 31).

SEQ ID NO. 33 is an amino acid sequence of the toxin designated F525A.
SEQ ID NO. 34 is a nucleotide sequence encoding an amino acid sequence of toxin F525A (SEQ ID NO. 33).

SEQ ID NO. 35 is an amino acid sequence of the toxin designated F525B.
SEQ ID NO. 36 is a nucleotide sequence encoding an amino acid sequence of toxin F525B (SEQ ID NO. 35).

SEQ ID NO. 37 is an amino acid sequence of the toxin designated F525C.
SEQ ID NO. 38 is a nucleotide sequence encoding an amino acid sequence of toxin F525C (SEQ ID NO. 37).

SEQ ID NO. 39 is an amino acid sequence of the toxin designated F573A.
SEQ ID NO. 40 is a nucleotide sequence encoding an amino acid sequence of toxin F573A (SEQ ID NO. 39).

SEQ ID NO. 41 is an amino acid sequence of the toxin designated F573B.
SEQ ID NO. 42 is a nucleotide sequence encoding an amino acid sequence of toxin F573B (SEQ ID NO. 41).

SEQ ID NO. 43 is an amino acid sequence of the toxin designated F573C.
SEQ ID NO. 44 is a nucleotide sequence encoding an amino acid sequence of toxin F573C (SEQ ID NO. 43).

SEQ ID NO. 45 is an amino acid sequence of the toxin designated FBB1A.
SEQ ID NO. 46 is a nucleotide sequence encoding an amino acid sequence of toxin FBB1A (SEQ ID NO. 45).

SEQ ID NO. 47 is an amino acid sequence of the toxin designated FBB1BR.
SEQ ID NO. 48 is a nucleotide sequence encoding an amino acid sequence of toxin FBB1BR (SEQ ID NO. 47).

SEQ ID NO. 49 is an amino acid sequence of the toxin designated FBB1C.
SEQ ID NO. 50 is a nucleotide sequence encoding an amino acid sequence of toxin FBB1C (SEQ ID NO. 49).

SEQ ID NO. 51 is an amino acid sequence of the toxin designated FBB1D.
SEQ ID NO. 52 is a nucleotide sequence encoding an amino acid sequence of toxin FBB1D (SEQ ID NO. 51).

SEQ ID NO. 53 is an amino acid sequence of the toxin designated J31AR.
SEQ ID NO. 54 is a nucleotide sequence encoding an amino acid sequence of toxin J31AR (SEQ ID NO. 53).

SEQ ID NO. 55 is an amino acid sequence of the toxin designated J32AR.
SEQ ID NO. 56 is a nucleotide sequence encoding an amino acid sequence of toxin J32AR (SEQ ID NO. 55).

SEQ ID NO. 57 is an amino acid sequence of the toxin designated W1FAR.
SEQ ID NO. 58 is a nucleotide sequence encoding an amino acid sequence of toxin W1FAR (SEQ ID NO. 57).

SEQ ID NO. 59 is an amino acid sequence of the toxin designated W1FBR.
SEQ ID NO. 60 is a nucleotide sequence encoding an amino acid sequence of toxin W1FBR (SEQ ID NO. 59).

SEQ ID NO. 61 is an amino acid sequence of the toxin designated W1FC.
SEQ ID NO. 62 is a nucleotide sequence encoding an amino acid sequence of toxin W1FC (SEQ ID NO. 61).

SEQ ID NO. 63 is an oligonucleotide useful as a PCR primer or hybridization probe according to the subject invention.
SEQ ID NO. 64 is an oligonucleotide useful as a PCR primer or hybridization probe according to the subject invention.
SEQ ID NO. 65 is an oligonucleotide useful as a PCR primer or hybridization probe according to the subject invention.
SEQ ID NO. 66 is an oligonucleotide useful as a PCR primer or hybridization probe according to the subject invention.
SEQ ID NO. 67 is an oligonucleotide useful as a PCR primer or hybridization probe according to the subject invention.
SEQ ID NO. 68 is an oligonucleotide useful as a PCR primer or hybridization probe according to the subject invention.
SEQ ID NO. 69 is an oligonucleotide useful as a PCR primer or hybridization probe according to the subject invention.

SEQ ID NO. 70 is an amino acid sequence of the toxin designated 86BB1(a).
SEQ ID NO. 71 is a nucleotide sequence encoding an amino acid sequence of toxin 86BB1(a).

SEQ ID NO. 72 is an amino acid sequence of the toxin designated 86BB1(b).
SEQ ID NO. 73 is a nucleotide sequence encoding an amino acid sequence of toxin 86BB1(b).

SEQ ID NO. 74 is an amino acid sequence of the toxin designated 31G1(a).
SEQ ID NO. 75 is a nucleotide sequence encoding an amino acid sequence of toxin 31G1(a).

SEQ ID NO. 76 is an amino acid sequence of the toxin designated 129HD chimeric.
SEQ ID NO. 77 is a nucleotide sequence encoding an amino acid sequence of toxin 129HD chimeric.

SEQ ID NO. 78 is an amino acid sequence of the toxin designated 11B(a).
SEQ ID NO. 79 is a nucleotide sequence encoding an amino acid sequence of toxin 11B(a).

SEQ ID NO. 80 is an amino acid sequence of the toxin designated 31G1(b).
SEQ ID NO. 81 is a nucleotide sequence encoding an amino acid sequence of toxin 31G1(b).

SEQ ID NO. 82 is an amino acid sequence of the toxin designated 86BB1(c).
SEQ ID NO. 83 is a nucleotide sequence encoding an amino acid sequence of toxin 86BB1(c).

SEQ ID NO. 84 is an amino acid sequence of the toxin designated 86V1(a).
SEQ ID NO. 85 is a nucleotide sequence encoding an amino acid sequence of toxin 86V1(a).

SEQ ID NO. 86 is an amino acid sequence of the toxin designated 86W1(a).
SEQ ID NO. 87 is a nucleotide sequence encoding an amino acid sequence of toxin 86W1(a).

SEQ ID NO. 88 is a partial amino acid sequence of the toxin designated 94R1(a).
SEQ ID NO. 89 is a partial nucleotide sequence encoding an amino acid sequence of toxin 94R1(a).

SEQ ID NO. 90 is an amino acid sequence of the toxin designated 185U2(a).
SEQ ID NO. 91 is a nucleotide sequence encoding an amino acid sequence of toxin 185U2(a).

SEQ ID NO. 92 is an amino acid sequence of the toxin designated 202S(a).
SEQ ID NO. 93 is a nucleotide sequence encoding an amino acid sequence of toxin 202S(a).

SEQ ID NO. 94 is an amino acid sequence of the toxin designated 213E5(a).
SEQ ID NO. 95 is a nucleotide sequence encoding an amino acid sequence of toxin 213E5(a).

SEQ ID NO. 96 is an amino acid sequence of the toxin designated 218G2(a).
SEQ ID NO. 97 is a nucleotide sequence encoding an amino acid sequence of toxin 218G2(a).

SEQ ID NO. 98 is an amino acid sequence of the toxin designated 29HD(a).
SEQ ID NO. 99 is a nucleotide sequence encoding an amino acid sequence of toxin 29HD(a).

SEQ ID NO. 100 is an amino acid sequence of the toxin designated 110HD(a).
SEQ ID NO. 101 is a nucleotide sequence encoding an amino acid sequence of toxin 110HD(a).

SEQ ID NO. 102 is an amino acid sequence of the toxin designated 129HD(b).
SEQ ID NO. 103 is a nucleotide sequence encoding an amino acid sequence of toxin 129HD(b).

SEQ ID NO. 104 is a partial amino acid sequence of the toxin designated 573HD(a).
SEQ ID NO. 105 is a partial nucleotide sequence encoding an amino acid sequence of toxin 573HD(a).

Detailed Disclosure of the Invention 
The subject invention concerns materials and methods for the control of non-mammalian
pests. In specific embodiments, the subject invention pertains to new Bacillus thuringiensis
isolates and toxins which have activity against lepidopterans. In a particularly preferred
embodiment, the toxins and methodologies described herein can be used to control black
cutworm. The subject invention further concerns novel genes which encode pesticidal toxins
and novel methods for identifying and characterizing B.t. genes which encode toxins with useful
properties. The subject invention concerns not only the polynucleotide sequences which encode
these toxins, but also the use of these polynucleotide sequences to produce recombinant hosts
which express the toxins.

Certain proteins of the subject invention are distinct from the crystal or "Cry" proteins
which have previously been isolated from Bacillus thuringiensis.

A further aspect of the subject invention concerns novel isolates and the toxins and
genes obtainable from these isolates. The novel B.t. isolates of the subject invention have been
designated PS31G1, PS185U2, PS11B, PS218G2, PS213E5, PS28C, PS86BB1, PS89J3,
PS94R1, PS202S, PS101DD, and PS27J2.

The new toxins and polynucleotide sequences provided here are defined according to 
several parameters. One critical characteristic of the toxins described herein is pesticidal 
activity. In a specific embodiment, these toxins have activity against lepidopteran pests. The 
toxins and genes of the subject invention can be further defined by their amino acid and
nucleotide sequences. The sequences of the molecules can be defined in terms of homology to 
certain exemplified sequences as well as in terms of the ability to hybridize with, or be amplified 
by, certain exemplified probes and primers. The toxins provided herein can also be identified 
based on their immunoreactivity with certain antibodies. 
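
As a rough illustration of defining a sequence "in terms of homology" to an exemplified sequence, percent identity over a pre-computed pairwise alignment can be calculated as sketched below. The alignment method, any identity cutoff, and the example sequences are assumptions made for this illustration; none of them is specified in the passage above, and the function name is hypothetical.

```python
# Percent identity over an existing pairwise alignment ('-' marks a gap).
# The sequences used here are invented and are not taken from the sequence listing.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity between two aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


identity = percent_identity("MKTAY-LV", "MKSAYQLV")
```

Note that different conventions exist for the denominator (alignment length, shorter sequence length, or ungapped columns only), so any reported identity figure should state which convention was used.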

Methods have been developed for making useful chimeric toxins by combining portions
of B.t. crystal proteins. The portions which are combined need not, themselves, be pesticidal so
long as the combination of portions creates a chimeric protein which is pesticidal. This can be
done using restriction enzymes, as described in, for example, European Patent 0 228 838; Ge,
A.Z., N.L. Shivarova, D.H. Dean (1989) Proc. Natl. Acad. Sci. USA 86:4037-4041; Ge, A.Z.,
D. Rivers, R. Milne, D.H. Dean (1991) J. Biol. Chem. 266:17954-17958; Schnepf, H.E., K.
Tomczak, J.P. Ortega, H.R. Whiteley (1990) J. Biol. Chem. 265:20923-20930; Honee, G., D.
Convents, J. Van Rie, S. Jansens, M. Peferoen, B. Visser (1991) Mol. Microbiol. 5:2799-2806.
Alternatively, recombination using cellular recombination mechanisms can be used to achieve
similar results. See, for example, Caramori, T., A.M. Albertini, A. Galizzi (1991) Gene 98:37-44;
Widner, W.R., H.R. Whiteley (1990) J. Bacteriol. 172:2826-2832; Bosch, D., B. Schipper,
H. van der Kleij, R.A. de Maagd, W.J. Stiekema (1994) Biotechnology 12:915-918. A number
of other methods are known in the art by which such chimeric DNAs can be made. The subject
invention is meant to include chimeric proteins that utilize the novel sequences identified in the
subject application.

With the teachings provided herein, one skilled in the art could readily produce and use 
the various toxins and polynucleotide sequences described herein. 

B.t. isolates useful according to the subject invention have been deposited in the
permanent collection of the Agricultural Research Service Patent Culture Collection (NRRL),
Northern Regional Research Center, 1815 North University Street, Peoria, Illinois 61604, USA.
The culture repository numbers of the B.t. strains are as follows:

Culture                               Repository No.    Deposit Date
B.t. PS11B (MT274)                    [illegible]       April 18, 1996
B.t. PS86BB1 (MT275)                  NRRL B-21557      April 18, 1996
B.t. PS86V1 (MT276)                   [illegible]       April 18, 1996
B.t. PS86W1 (MT277)                   NRRL B-21559      April 18, 1996
B.t. PS31G1 (MT278)                   NRRL B-21560      April 18, 1996
B.t. PS89J3 (MT279)                   NRRL B-21561      April 18, 1996
[illegible]                           NRRL B-21562      April 18, 1996
B.t. PS27J2                           NRRL B-           July 1, 1997
B.t. PS28E                            NRRL B-           July 1, 1997
B.t. PS94R1                           NRRL B-           July 1, 1997
B.t. PS101DD                          NRRL B-           July 1, 1997
B.t. PS202S                           NRRL B-           July 1, 1997
B.t. PS213E5                          NRRL B-           July 1, 1997
B.t. PS218G2                          NRRL B-           July 1, 1997
E. coli NM522 (MR 922) (pMYC2451)     NRRL B-21794      June 27, 1997
E. coli NM522 (MR 923) (pMYC2453)     NRRL B-21795      June 27, 1997
E. coli NM522 (MR 924) (pMYC2454)     NRRL B-21796      June 27, 1997


Cultures which have been deposited for the purposes of this patent application were 
deposited under conditions that assure that access to the cultures is available during the 
pendency of this patent application to one determined by the Commissioner of Patents and 
Trademarks to be entitled thereto under 37 CFR 1.14 and 35 U.S.C. 122. The deposits will be
available as required by foreign patent laws in countries wherein counterparts of the subject 
application, or its progeny, are filed. However, it should be understood that the availability of 
a deposit does not constitute a license to practice the subject invention in derogation of patent 
rights granted by governmental action. 

Further, the subject culture deposits will be stored and made available to the public in
accord with the provisions of the Budapest Treaty for the Deposit of Microorganisms, i.e., they
will be stored with all the care necessary to keep them viable and uncontaminated for a period
of at least five years after the most recent request for the furnishing of a sample of the deposit,
and in any case, for a period of at least thirty (30) years after the date of deposit or for the
enforceable life of any patent which may issue disclosing the culture(s). The depositor
acknowledges the duty to replace the deposit(s) should the depository be unable to furnish a
sample when requested, due to the condition of a deposit. All restrictions on the availability to
the public of the subject culture deposits will be irrevocably removed upon the granting of a
patent disclosing them.

Following is a table which provides characteristics of certain isolates useful according 
to the subject invention. 



Table 1. Description of B.t. strains toxic to lepidopterans

Culture    Crystal Description    Approx. MW (kDa)                    Serotype
PS185U2    small bipyramid        130 kDa doublet, 70 kDa             ND
PS11B      bipyramid ton          130 kDa, 70 kDa                     ND
PS218G2    amorphic               135 kDa, 127 kDa                    ND
PS213E5    amorphic               130 kDa                             ND
PS86W1     multiple amorphic      130 kDa doublet                     5a5b galleriae
PS28C      amorphic               130 kDa triplet                     5a5b galleriae
PS86BB1    BP without             130 kDa doublet                     5a5b galleriae
PS89J3     spherical/amorphic     130 kDa doublet                     ND
PS86V1     BP                     130 kDa doublet                     ND
PS94R1     BP and amorphic        130 kDa doublet                     ND
HD525      BP and amorphic        130 kDa                             not motile
HD573      multiple amorphic      135 kDa, 79 kDa doublet, 72 kDa     not motile
PS27J2     lemon-shaped           130 kDa, 50 kDa                     4 (sotto or kenyae)

ND = not determined



In one embodiment, the subject invention concerns materials and methods including
nucleotide primers and probes for isolating and identifying Bacillus thuringiensis (B.t.) genes
encoding protein toxins which are active against lepidopteran pests. The nucleotide sequences
described herein can also be used to identify new pesticidal B.t. isolates. The invention further
concerns the genes, isolates, and toxins identified using the methods and materials disclosed
herein.

Genes and toxins. The genes and toxins useful according to the subject invention
include not only the full length sequences but also fragments of these sequences, variants,
mutants, and fusion proteins which retain the characteristic pesticidal activity of the toxins
specifically exemplified herein. Chimeric genes and toxins, produced by combining portions
from more than one B.t. toxin or gene, may also be utilized according to the teachings of the
subject invention. As used herein, the terms "variants" or "variations" of genes refer to
nucleotide sequences which encode the same toxins or which encode equivalent toxins having
pesticidal activity. As used herein, the term "equivalent toxins" refers to toxins having the same
or essentially the same biological activity against the target pests as the exemplified toxins.

It should be apparent to a person skilled in this art that genes encoding active toxins can
be identified and obtained through several means. The specific genes exemplified herein may
be obtained from the isolates deposited at a culture depository as described above. These genes,
or portions or variants thereof, may also be constructed synthetically, for example, by use of a
gene synthesizer. Variations of genes may be readily constructed using standard techniques for
making point mutations. Also, fragments of these genes can be made using commercially
available exonucleases or endonucleases according to standard procedures. For example,
enzymes such as Bal31 or site-directed mutagenesis can be used to systematically cut off
nucleotides from the ends of these genes. Also, genes which encode active fragments may be
obtained using a variety of restriction enzymes. Proteases may be used to directly obtain active
fragments of these toxins.

Equivalent toxins and/or genes encoding these equivalent toxins can be derived from
B.t. isolates and/or DNA libraries using the teachings provided herein. There are a number of
methods for obtaining the pesticidal toxins of the instant invention. For example, antibodies to
the pesticidal toxins disclosed and claimed herein can be used to identify and isolate other toxins
from a mixture of proteins. Specifically, antibodies may be raised to the portions of the toxins
which are most constant and most distinct from other B.t. toxins. These antibodies can then be
used to specifically identify equivalent toxins with the characteristic activity by
immunoprecipitation, enzyme linked immunosorbent assay (ELISA), or western blotting.
Antibodies to the toxins disclosed herein, or to equivalent toxins, or fragments of these toxins,
can readily be prepared using standard procedures in this art. The genes which encode these
toxins can then be obtained from the microorganism.

Fragments and equivalents which retain the pesticidal activity of the exemplified toxins
would be within the scope of the subject invention. Also, because of the redundancy of the
genetic code, a variety of different DNA sequences can encode the amino acid sequences
disclosed herein. It is well within the skill of a person trained in the art to create these
alternative DNA sequences encoding the same, or essentially the same, toxins. These variant
DNA sequences are within the scope of the subject invention. As used herein, reference to
"essentially the same" sequence refers to sequences which have amino acid substitutions,
deletions, additions, or insertions which do not materially affect pesticidal activity. Fragments
retaining pesticidal activity are also included in this definition.
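
The redundancy of the genetic code can be illustrated with a minimal sketch. The short peptide "MKFD" and the function names below are hypothetical and chosen only for illustration; the codon assignments are from the standard genetic code, restricted to the residues used here. The sketch enumerates every DNA sequence that encodes the same peptide.

```python
from itertools import product

# Partial standard codon table, limited to the residues in the example peptide.
CODONS = {
    "M": ["ATG"],
    "K": ["AAA", "AAG"],
    "F": ["TTT", "TTC"],
    "D": ["GAT", "GAC"],
}

def encodings(peptide):
    """Return every DNA sequence that encodes the peptide under CODONS."""
    choices = [CODONS[residue] for residue in peptide]
    return ["".join(combo) for combo in product(*choices)]

def translate(dna):
    """Translate DNA back to a peptide using the same partial table."""
    reverse = {codon: aa for aa, codons in CODONS.items() for codon in codons}
    return "".join(reverse[dna[i:i + 3]] for i in range(0, len(dna), 3))

if __name__ == "__main__":
    variants = encodings("MKFD")  # hypothetical four-residue peptide
    for dna in variants:
        print(dna, "->", translate(dna))
```

For this four-residue example there are eight distinct DNA sequences, all of which translate back to the same peptide.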
A further method for identifying the toxins and genes of the subject invention is through
the use of oligonucleotide probes. These probes are detectable nucleotide sequences. Probes
provide a rapid method for identifying toxin-encoding genes of the subject invention. The
nucleotide segments which are used as probes according to the invention can be synthesized
using a DNA synthesizer and standard procedures.

Certain toxins of the subject invention have been specifically exemplified herein. Since
these toxins are merely exemplary of the toxins of the subject invention, it should be readily
apparent that the subject invention comprises variant or equivalent toxins (and nucleotide
sequences coding for equivalent toxins) having the same or similar pesticidal activity as the
exemplified toxin. Equivalent toxins will have amino acid homology with an exemplified toxin.
This amino acid identity will typically be greater than 60%, preferably be greater than 75%,
more preferably greater than 80%, more preferably greater than 90%, and can be greater than
95%. The amino acid homology will be highest in critical regions of the toxin which account
for biological activity or are involved in the determination of three-dimensional configuration
which ultimately is responsible for the biological activity. In this regard, certain amino acid
substitutions are acceptable and can be expected if these substitutions are in regions which are
not critical to activity or are conservative amino acid substitutions which do not affect the
three-dimensional configuration of the molecule. For example, amino acids may be placed in the
following classes: non-polar, uncharged polar, basic, and acidic. Conservative substitutions
whereby an amino acid of one class is replaced with another amino acid of the same type fall
within the scope of the subject invention so long as the substitution does not materially alter the
biological activity of the compound. Table 2 provides a listing of examples of amino acids
belonging to each class.


Table 2.

Class of Amino Acid    Examples of Amino Acids
Nonpolar               Ala, Val, Leu, Ile, Pro, Met, Phe, Trp
Uncharged Polar        Gly, Ser, Thr, Cys, Tyr, Asn, Gln
Acidic                 Asp, Glu
Basic                  Lys, Arg, His
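
As an illustration of how the classes of Table 2 might be applied, the following sketch tallies identical, conservative, and non-conservative positions between two pre-aligned sequences. The function names and the example sequences are hypothetical, and the sketch assumes that both sequences are already aligned and of equal length; it says nothing about whether a given substitution actually preserves biological activity, which must be determined by assay.

```python
# Amino acid classes as listed in Table 2 (three-letter codes).
CLASSES = {
    "nonpolar": {"Ala", "Val", "Leu", "Ile", "Pro", "Met", "Phe", "Trp"},
    "uncharged polar": {"Gly", "Ser", "Thr", "Cys", "Tyr", "Asn", "Gln"},
    "acidic": {"Asp", "Glu"},
    "basic": {"Lys", "Arg", "His"},
}

def class_of(residue):
    """Return the Table 2 class name for a three-letter residue code."""
    for name, members in CLASSES.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue}")

def classify_substitutions(reference, variant):
    """Count identical, conservative, and non-conservative aligned positions."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be pre-aligned and of equal length")
    counts = {"identical": 0, "conservative": 0, "non-conservative": 0}
    for ref, var in zip(reference, variant):
        if ref == var:
            counts["identical"] += 1
        elif class_of(ref) == class_of(var):
            counts["conservative"] += 1  # same class, e.g. Ala -> Val
        else:
            counts["non-conservative"] += 1  # different class, e.g. Asp -> Lys
    return counts

if __name__ == "__main__":
    reference = ["Ala", "Lys", "Asp", "Gly"]  # hypothetical fragment
    variant = ["Val", "Lys", "Lys", "Gly"]    # hypothetical variant
    print(classify_substitutions(reference, variant))
```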
In some instances, non-conservative substitutions can also be made. The critical factor
is that these substitutions must not significantly detract from the biological activity of the toxin.

The toxins of the subject invention can also be characterized in terms of the shape and
location of toxin inclusions, which are described above.
As used herein, reference to "isolated" polynucleotides and/or "purified" toxins refers
to these molecules when they are not associated with the other molecules with which they would
be found in nature. Thus, "purified" toxins would include, for example, the subject toxins
expressed in plants. Reference to "isolated and purified" signifies the involvement of the "hand
of man" as described herein. Chimeric toxins and genes also involve the "hand of man."

Recombinant hosts. The toxin-encoding genes harbored by the isolates of the subject
invention can be introduced into a wide variety of microbial or plant hosts. Expression of the
toxin gene results, directly or indirectly, in the intracellular production and maintenance of the
pesticide. With suitable microbial hosts, e.g., Pseudomonas, the microbes can be applied to the
situs of the pest, where they will proliferate and be ingested. The result is a control of the pest.
Alternatively, the microbe hosting the toxin gene can be treated under conditions that prolong
the activity of the toxin and stabilize the cell. The treated cell, which retains the toxic activity,
then can be applied to the environment of the target pest.

Where the B.t. toxin gene is introduced via a suitable vector into a microbial host, and
said host is applied to the environment in a living state, it is essential that certain host microbes
be used. Microorganism hosts are selected which are known to occupy the "phytosphere"
(phylloplane, phyllosphere, rhizosphere, and/or rhizoplane) of one or more crops of interest.
These microorganisms are selected so as to be capable of successfully competing in the
particular environment (crop and other insect habitats) with the wild-type microorganisms,
provide for stable maintenance and expression of the gene expressing the polypeptide pesticide,
and, desirably, provide for improved protection of the pesticide from environmental degradation
and inactivation.

A large number of microorganisms are known to inhabit the phylloplane (the surface
of the plant leaves) and/or the rhizosphere (the soil surrounding plant roots) of a wide variety
of important crops. These microorganisms include bacteria, algae, and fungi. Of particular
interest are microorganisms, such as bacteria, e.g., genera Pseudomonas, Erwinia, Serratia,
Klebsiella, Xanthomonas, Streptomyces, Rhizobium, Rhodopseudomonas, Methylophilus,
Agrobacterium, Acetobacter, Lactobacillus, Arthrobacter, Azotobacter, Leuconostoc, and
Alcaligenes; fungi, particularly yeast, e.g., genera Saccharomyces, Cryptococcus,
Kluyveromyces, Sporobolomyces, Rhodotorula, and Aureobasidium. Of particular interest are
such phytosphere bacterial species as Pseudomonas syringae, Pseudomonas fluorescens,
Serratia marcescens, Acetobacter xylinum, Agrobacterium tumefaciens, Rhodopseudomonas
spheroides, Xanthomonas campestris, Rhizobium meliloti, Alcaligenes eutrophus, and
Azotobacter vinelandii; and phytosphere yeast species such as Rhodotorula rubra, R. glutinis, R.
marina, R. aurantiaca, Cryptococcus albidus, C. diffluens, C. laurentii, Saccharomyces rosei,
S. pretoriensis, S. cerevisiae, Sporobolomyces roseus, S. odorus, Kluyveromyces veronae, and
Aureobasidium pullulans. Of particular interest are the pigmented microorganisms.

A wide variety of ways are available for introducing a B.t. gene encoding a toxin into
a microorganism host under conditions which allow for stable maintenance and expression of 
the gene. These methods are well known to those skilled in the art and are described, for 
example, in United States Patent No. 5,135,867, which is incorporated herein by reference. 

Control of lepidopterans, including black cutworm, using the isolates, toxins, and genes 
of the subject invention can be accomplished by a variety of methods known to those skilled in 
the art. These methods include, for example, the application of B.t. isolates to the pests (or their 
location), the application of recombinant microbes to the pests (or their locations), and the 
transformation of plants with genes which encode the pesticidal toxins of the subject invention. 
Recombinant microbes may be, for example, a B.t., E. coli, or Pseudomonas. Transformations 
can be made by those skilled in the art using standard techniques. Materials necessary for these 
transformations are disclosed herein or are otherwise readily available to the skilled artisan. 

Synthetic genes which are functionally equivalent to the genes encoding the toxins of the
subject invention can also be used to transform hosts. Methods for the production of synthetic
genes can be found in, for example, U.S. Patent No. 5,380,831.

Treatment of cells. As mentioned above, B.t. or recombinant cells expressing a B.t.
toxin can be treated to prolong the toxin activity and stabilize the cell. The pesticide 
microcapsule that is formed comprises the B.t. toxin within a cellular structure that has been 
stabilized and will protect the toxin when the microcapsule is applied to the environment of the 
target pest. Suitable host cells may include either prokaryotes or eukaryotes, normally being 
limited to those cells which do not produce substances toxic to higher organisms, such as 
mammals. However, organisms which produce substances toxic to higher organisms could be 
used, where the toxic substances are unstable or the level of application sufficiently low as to 
avoid any possibility of toxicity to a mammalian host. As hosts, of particular interest will be 
the prokaryotes and the lower eukaryotes, such as fungi. 

The cell will usually be intact and be substantially in the proliferative form when 
treated, rather than in a spore form, although in some instances spores may be employed. 
Treatment of the microbial cell, e.g., a microbe containing the B.t. toxin gene, can be
by chemical or physical means, or by a combination of chemical and/or physical means, so long
as the technique does not deleteriously affect the properties of the toxin, nor diminish the
cellular capability of protecting the toxin. Examples of chemical reagents are halogenating
agents, particularly halogens of atomic no. 17-80. More particularly, iodine can be used under
mild conditions and for sufficient time to achieve the desired results. Other suitable techniques
include treatment with aldehydes, such as glutaraldehyde; anti-infectives, such as zephiran
chloride and cetylpyridinium chloride; alcohols, such as isopropyl and ethanol; various
histologic fixatives, such as Lugol iodine, Bouin's fixative, various acids and Helly's fixative
(See: Humason, Gretchen L., Animal Tissue Techniques, W.H. Freeman and Company, 1967);
or a combination of physical (heat) and chemical agents that preserve and prolong the activity
of the toxin produced in the cell when the cell is administered to the host environment.
Examples of physical means are short wavelength radiation such as gamma-radiation and
X-radiation, freezing, UV irradiation, lyophilization, and the like. Methods for treatment of
microbial cells are disclosed in United States Patent Nos. 4,695,455 and 4,695,462, which are
incorporated herein by reference.

The cells generally will have enhanced structural stability which will enhance resistance
to environmental conditions. Where the pesticide is in a proform, the method of cell treatment
should be selected so as not to inhibit processing of the proform to the mature form of the
pesticide by the target pest pathogen. For example, formaldehyde will crosslink proteins and
could inhibit processing of the proform of a polypeptide pesticide. The method of treatment
should retain at least a substantial portion of the bioavailability or bioactivity of the toxin.

Characteristics of particular interest in selecting a host cell for purposes of production
include ease of introducing the B.t. gene into the host, availability of expression systems,
efficiency of expression, stability of the pesticide in the host, and the presence of auxiliary
genetic capabilities. Characteristics of interest for use as a pesticide microcapsule include
protective qualities for the pesticide, such as thick cell walls, pigmentation, and intracellular
packaging or formation of inclusion bodies; survival in aqueous environments; lack of
mammalian toxicity; attractiveness to pests for ingestion; ease of killing and fixing without
damage to the toxin; and the like. Other considerations include ease of formulation and
handling, economics, storage stability, and the like.

Growth of cells. The cellular host containing the B.t. insecticidal gene may be grown
in any convenient nutrient medium, where the DNA construct provides a selective advantage, 
providing for a selective medium so that substantially all or all of the cells retain the B.t. gene. 



WO 98/00546 



PCT/US97/11658 



These cells may then be harvested in accordance with conventional ways. Alternatively, the 
cells can be treated prior to harvesting. 

The B.t. cells of the invention can be cultured using standard art media and fermentation 
techniques. Upon completion of the fermentation cycle the bacteria can be harvested by first 
separating the B.t. spores and crystals from the fermentation broth by means well known in the 
art. The recovered B.t. spores and crystals can be formulated into a wettable powder, liquid 
concentrate, granules or other formulations by the addition of surfactants, dispersants, inert 
carriers, and other components to facilitate handling and application for particular target pests. 
These formulations and application procedures are all well known in the art. 

Methods and formulations for control of pests. Control of lepidopterans using the
isolates, toxins, and genes of the subject invention can be accomplished by a variety of methods 
known to those skilled in the art. These methods include, for example, the application of B.t. 
isolates to the pests (or their location), the application of recombinant microbes to the pests (or 
their locations), and the transformation of plants with genes which encode the pesticidal toxins 
of the subject invention. Recombinant microbes may be, for example, a B.t., E. coli, or
Pseudomonas. Transformations can be made by those skilled in the art using standard 
techniques. Materials necessary for these transformations are disclosed herein or are otherwise 
readily available to the skilled artisan. 

Formulated bait granules containing an attractant and spores and crystals of the B.t. 
isolates, or recombinant microbes comprising the genes obtainable from the B.t. isolates 
disclosed herein, can be applied to the soil. Formulated product can also be applied as a seed- 
coating or root treatment or total plant treatment at later stages of the crop cycle. Plant and soil 
treatments of B.t. cells may be employed as wettable powders, granules or dusts, by mixing with 
various inert materials, such as inorganic minerals (phyllosilicates, carbonates, sulfates, 
phosphates, and the like) or botanical materials (powdered corncobs, rice hulls, walnut shells,
and the like). The formulations may include spreader-sticker adjuvants, stabilizing agents, other 
pesticidal additives, or surfactants. Liquid formulations may be aqueous-based or non-aqueous 
and employed as foams, gels, suspensions, emulsifiable concentrates, or the like. The 
ingredients may include rheological agents, surfactants, emulsifiers, dispersants, or polymers. 

As would be appreciated by a person skilled in the art, the pesticidal concentration will 
vary widely depending upon the nature of the particular formulation, particularly whether it is 
a concentrate or to be used directly. The pesticide will be present in at least 1% by weight and 
may be 100% by weight. The dry formulations will have from about 1-95% by weight of the 
pesticide while the liquid formulations will generally be from about 1-60% by weight of the
solids in the liquid phase. The formulations will generally have from about 10^2 to about 10^4
cells/mg. These formulations will be administered at about 50 mg (liquid or dry) to 1 kg or 
more per hectare. 
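
The arithmetic implied by these ranges can be worked through in a short sketch. It assumes the stated cell density is read as 10^2 to 10^4 cells/mg and uses the lower and upper application rates of 50 mg and 1 kg per hectare; the function name is hypothetical.

```python
def cells_per_hectare(cells_per_mg, mg_per_hectare):
    """Total cells applied per hectare for a given density and application rate."""
    return cells_per_mg * mg_per_hectare

if __name__ == "__main__":
    # Lower bound: 10^2 cells/mg applied at 50 mg per hectare.
    low = cells_per_hectare(10**2, 50)
    # Upper bound: 10^4 cells/mg applied at 1 kg (1,000,000 mg) per hectare.
    high = cells_per_hectare(10**4, 1_000_000)
    print(f"{low:.1e} to {high:.1e} cells per hectare")
```

Under these assumptions the applied dose spans roughly 5 x 10^3 to 10^10 cells per hectare, which shows how widely the dose can vary with the formulation and the application rate.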

The formulations can be applied to the environment of the pest, e.g., soil and foliage,
by spraying, dusting, sprinkling, or the like.

Mutants. Mutants of the isolates of the invention can be made by procedures well
known in the art. For example, an asporogenous mutant can be obtained through ethylmethane 
sulfonate (EMS) mutagenesis of an isolate. The mutants can be made using ultraviolet light and 
nitrosoguanidine by procedures well known in the art. 

A smaller percentage of the asporogenous mutants will remain intact and not lyse for
extended fermentation periods; these strains are designated lysis minus (-). Lysis minus strains
can be identified by screening asporogenous mutants in shake flask media and selecting those
mutants that are still intact and contain toxin crystals at the end of the fermentation. Lysis
minus strains are suitable for a cell treatment process that will yield a protected, encapsulated
toxin protein.

To prepare a phage resistant variant of said asporogenous mutant, an aliquot of the
phage lysate is spread onto nutrient agar and allowed to dry. An aliquot of the phage sensitive
bacterial strain is then plated directly over the dried lysate and allowed to dry. The plates are
incubated at 30°C for 2 days, at which time numerous colonies can be seen growing on the
agar. Some of these colonies are picked and subcultured onto
nutrient agar plates. These apparent resistant cultures are tested for resistance by cross streaking
with the phage lysate. A line of the phage lysate is streaked on the plate and allowed to dry.
The presumptive resistant cultures are then streaked across the phage line. Resistant bacterial
cultures show no lysis anywhere in the streak across the phage line after overnight incubation
at 30°C. The resistance to phage is then reconfirmed by plating a lawn of the resistant culture
onto a nutrient agar plate. The sensitive strain is also plated in the same manner to serve as the
positive control. After drying, a drop of the phage lysate is placed in the center of the plate and
allowed to dry. Resistant cultures show no lysis in the area where the phage lysate has been
placed after incubation at 30°C for 24 hours.

Polynucleotide probes. It is well known that DNA possesses a fundamental property
called base complementarity. In nature, DNA ordinarily exists in the form of pairs of
anti-parallel strands, the bases on each strand projecting from that strand toward the opposite strand.
The base adenine (A) on one strand will always be opposed to the base thymine (T) on the other
strand, and the base guanine (G) will be opposed to the base cytosine (C). The bases are held
in apposition by their ability to hydrogen bond in this specific way. Though each individual
bond is relatively weak, the net effect of many adjacent hydrogen bonded bases, together with
base stacking effects, is a stable joining of the two complementary strands. These bonds can be
broken by treatments such as high pH or high temperature, and these conditions result in the
dissociation, or "denaturation," of the two strands. If the DNA is then placed in conditions
which make hydrogen bonding of the bases thermodynamically favorable, the DNA strands will
anneal, or "hybridize," and reform the original double stranded DNA. If carried out under
appropriate conditions, this hybridization can be highly specific. That is, only strands with a
high degree of base complementarity will be able to form stable double stranded structures. The
relationship of the specificity of hybridization to reaction conditions is well known. Thus,
hybridization may be used to test whether two pieces of DNA are complementary in their base
sequences. It is this hybridization mechanism which facilitates the use of probes of the subject
invention to readily detect and characterize DNA sequences of interest.
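
Base complementarity between anti-parallel strands can be sketched in a few lines. The function names and the four-base example sequences are hypothetical, and the sketch only counts A-T and G-C pairing position by position for strands of equal length; it does not model hybridization thermodynamics.

```python
# Watson-Crick pairing: A opposes T, and G opposes C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(strand):
    """Return the anti-parallel complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))

def complementarity(probe, target):
    """Fraction of probe bases that pair with the anti-parallel target strand."""
    if len(probe) != len(target) or not probe:
        raise ValueError("strands must be non-empty and of equal length")
    # Reverse the target so that position i of the probe faces its partner base.
    paired = sum(COMPLEMENT[p] == t for p, t in zip(probe, reversed(target)))
    return paired / len(probe)

if __name__ == "__main__":
    probe = "ATGC"                       # hypothetical probe
    perfect = reverse_complement(probe)  # fully complementary target
    print(perfect, complementarity(probe, perfect))
    print(complementarity(probe, "GCAA"))  # one mismatched position
```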

The probes may be RNA or DNA. The probe will normally have at least about 10 bases,
more usually at least about 18 bases, and may have up to about 50 bases or more, usually not
having more than about 200 bases if the probe is made synthetically. However, longer probes
can readily be utilized, and such probes can be, for example, several kilobases in length. The
probe sequence is designed to be at least substantially complementary to a gene encoding a toxin
of interest. The probe need not have perfect complementarity to the sequence to which it
hybridizes. The probes may be labelled utilizing techniques which are well known to those
skilled in this art.

One approach for the use of the subject invention as probes entails first identifying by
Southern blot analysis of a gene bank of the B.t. isolate all DNA segments homologous with the
disclosed nucleotide sequences. Thus, it is possible, without the aid of biological analysis, to
know in advance the probable activity of many new B.t. isolates, and of the individual endotoxin
gene products expressed by a given B.t. isolate. Such a probe analysis provides a rapid method
for identifying potentially commercially valuable insecticidal endotoxin genes within the
multifarious subspecies of B.t.

One hybridization procedure useful according to the subject invention typically includes
the initial steps of isolating the DNA sample of interest and purifying it chemically. Either lysed
bacteria or total fractionated nucleic acid isolated from bacteria can be used. Cells can be
treated using known techniques to liberate their DNA (and/or RNA). The DNA sample can be
cut into pieces with an appropriate restriction enzyme. The pieces can be separated by size
through electrophoresis in a gel, usually agarose or acrylamide. The pieces of interest can be
transferred to an immobilizing membrane in a manner that retains the geometry of the pieces.
The membrane can then be dried and prehybridized to equilibrate it for later immersion in a
hybridization solution. The manner in which the nucleic acid is affixed to a solid support may
vary. This fixing of the DNA for later processing has great value for the use of this technique
in field studies, remote from laboratory facilities.
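
The cutting and size-separation steps can be mimicked in silico with a minimal sketch. The text names no particular enzyme, so the EcoRI recognition site (GAATTC, cut after the first G) is used purely as an assumed example; the function name and the test sequence are likewise hypothetical, and only the top strand of a linear molecule is considered.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Cut a linear sequence at every occurrence of a recognition site.

    cut_offset is the number of bases into the site at which the top strand
    is cut (1 for an EcoRI-like cut between G and AATTC).
    """
    fragments = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        position = sequence.find(site, position + 1)
    fragments.append(sequence[start:])  # remainder after the last cut
    return fragments

if __name__ == "__main__":
    sample = "AAAGAATTCCCCCCGAATTCTT"  # hypothetical sequence with two sites
    pieces = digest(sample)
    # Order fragments from largest to smallest, as bands would be read on a
    # gel from the loading well downward (smaller fragments migrate farther).
    for piece in sorted(pieces, key=len, reverse=True):
        print(len(piece), piece)
```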

The particular hybridization technique is not essential to the subject invention. As 
improvements are made in hybridization techniques, they can be readily applied. 

As is well known in the art, if the probe molecule and nucleic acid sample hybridize by
forming a strong non-covalent bond between the two molecules, it can be reasonably assumed
that the probe and sample are essentially identical. The probe's detectable label provides a
means for determining in a known manner whether hybridization has occurred.

The nucleotide segments of the subject invention which are used as probes can be
synthesized by use of DNA synthesizers using standard procedures. In the use of the nucleotide
segments as probes, the particular probe is labeled with any suitable label known to those skilled
in the art, including radioactive and non-radioactive labels. Typical radioactive labels include
32P, 35S, or the like. A probe labeled with a radioactive isotope can be constructed from a
nucleotide sequence complementary to the DNA sample by a conventional nick translation
reaction, using a DNase and DNA polymerase. The probe and sample can then be combined in
a hybridization buffer solution and held at an appropriate temperature until annealing occurs.
Thereafter, the membrane is washed free of extraneous materials, leaving the sample and bound
probe molecules typically detected and quantified by autoradiography and/or liquid scintillation
counting. For synthetic probes, it may be most desirable to use enzymes such as polynucleotide
kinase or terminal transferase to end-label the DNA for use as probes.

Non-radioactive labels include, for example, ligands such as biotin or thyroxine, as well
as enzymes such as hydrolases or peroxidases, or the various chemiluminescers such as
luciferin, or fluorescent compounds like fluorescein and its derivatives. The probes may be
made inherently fluorescent as described in International Application No. WO93/16094. The
probe may also be labeled at both ends with different types of labels for ease of separation, as,
for example, by using an isotopic label at the end mentioned above and a biotin label at the other
end.

The amount of labeled probe which is present in the hybridization solution will vary
widely, depending upon the nature of the label, the amount of the labeled probe which can
reasonably bind to the filter, and the stringency of the hybridization. Generally, substantial
excesses of the probe will be employed to enhance the rate of binding of the probe to the fixed
DNA.

Various degrees of stringency of hybridization can be employed. The more severe the
conditions, the greater the complementarity that is required for duplex formation. Severity can
be controlled by temperature, probe concentration, probe length, ionic strength, time, and the
like. Preferably, hybridization is conducted under stringent conditions by techniques well
known in the art, as described, for example, in Keller, G.H., M.M. Manak (1987) DNA Probes,
Stockton Press, New York, NY., pp. 169-170.

As used herein "stringent" conditions for hybridization refers to conditions which
achieve the same, or about the same, degree of specificity of hybridization as the conditions
employed by the current applicants. Specifically, hybridization of immobilized DNA on
Southern blots with 32P-labeled gene-specific probes was performed by standard methods
(Maniatis, T., E.F. Fritsch, J. Sambrook [1982] Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratory, Cold Spring Harbor, NY). In general, hybridization and subsequent
washes were carried out under stringent conditions that allowed for detection of target sequences
with homology to the exemplified toxin genes. For double-stranded DNA gene probes,
hybridization was carried out overnight at 20-25°C below the melting temperature (Tm) of the
DNA hybrid in 6X SSPE, 5X Denhardt's solution, 0.1% SDS, 0.1 mg/ml denatured DNA. The
melting temperature is described by the following formula (Beltz, G.A., K.A. Jacobs, T.H.
Eickbush, P.T. Cherbas, and F.C. Kafatos [1983] Methods of Enzymology, R. Wu, L. Grossman
and K. Moldave [eds.] Academic Press, New York 100:266-285).

Tm = 81.5°C + 16.6 Log[Na+] + 0.41(%G+C) - 0.61(%formamide) - 600/length of duplex
in base pairs.
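
The melting temperature formula above can be evaluated with a short sketch. The function names and the input values are hypothetical; the sketch assumes that "Log" denotes the base-10 logarithm and that [Na+] is expressed in moles per liter, neither of which is stated explicitly in the text. The second function simply applies the stated 20-25°C offset below Tm for double-stranded gene probes.

```python
import math

def duplex_tm(na_molar, gc_percent, formamide_percent, length_bp):
    """Melting temperature (deg C) of a DNA duplex per the formula in the text.

    Assumes Log is log base 10 and na_molar is the sodium ion molarity.
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 600.0 / length_bp)

def hybridization_window(tm):
    """Temperature range 20-25 deg C below Tm, returned as (low, high)."""
    return (tm - 25.0, tm - 20.0)

if __name__ == "__main__":
    # Hypothetical inputs: 1 M Na+, 50% G+C, no formamide, 600 bp duplex.
    tm = duplex_tm(na_molar=1.0, gc_percent=50.0,
                   formamide_percent=0.0, length_bp=600)
    low, high = hybridization_window(tm)
    print(f"Tm = {tm:.1f} C; hybridize between {low:.1f} C and {high:.1f} C")
```

With these illustrative inputs the logarithmic term vanishes and the length correction is 1°C, giving a Tm of about 101°C.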

Washes are typically carried out as follows:

(1) Twice at room temperature for 15 minutes in 1X SSPE, 0.1% SDS (low
stringency wash).

(2) Once at Tm-20°C for 15 minutes in 0.2X SSPE, 0.1% SDS (moderate
stringency wash).

For oligonucleotide probes, hybridization was carried out overnight at 10-20°C below
the melting temperature (Tm) of the hybrid in 6X SSPE, 5X Denhardt's solution, 0.1% SDS, 0.1
mg/ml denatured DNA. Tm for oligonucleotide probes was determined by the following
formula:

Tm (°C) = 2(number T/A base pairs) + 4(number G/C base pairs)
(Suggs, S.V., T. Miyake, E.H. Kawashima, M.J. Johnson, K. Itakura, and R.B. Wallace [1981]
ICN-UCLA Symp. Dev. Biol. Using Purified Genes, D.D. Brown [ed.], Academic Press, New
York, 23:683-693).
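
The oligonucleotide formula lends itself to a direct calculation. In the sketch below the function names and the 18-base example probe are hypothetical; the probe is assumed to pair perfectly with its target, so that every A or T contributes one T/A base pair and every G or C contributes one G/C base pair.

```python
def oligo_tm(probe):
    """Tm (deg C) of an oligonucleotide hybrid: 2 per T/A pair plus 4 per G/C pair."""
    probe = probe.upper()
    at_pairs = probe.count("A") + probe.count("T")
    gc_pairs = probe.count("G") + probe.count("C")
    if at_pairs + gc_pairs != len(probe):
        raise ValueError("probe may contain only A, C, G, and T")
    return 2 * at_pairs + 4 * gc_pairs

def oligo_hybridization_window(tm):
    """Temperature range 10-20 deg C below Tm, returned as (low, high)."""
    return (tm - 20, tm - 10)

if __name__ == "__main__":
    probe = "ATGCATGCATGCATGCAT"  # hypothetical 18-mer: 10 A/T and 8 G/C
    tm = oligo_tm(probe)
    print(tm, oligo_hybridization_window(tm))
```

For this example, 10 T/A pairs and 8 G/C pairs give a Tm of 52°C, so hybridization would fall between 32°C and 42°C under the stated 10-20°C offset.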

Washes were typically carried out as follows:

(1) Twice at room temperature for 15 minutes in 1X SSPE, 0.1% SDS (low stringency
wash).

(2) Once at the hybridization temperature for 15 minutes in 1X SSPE, 0.1% SDS
(moderate stringency wash).

Duplex formation and stability depend on substantial complementarity between the two 
strands of a hybrid, and, as noted above, a certain degree of mismatch can be tolerated. 
Therefore, the nucleotide sequences of the subject invention include mutations (both single and 
multiple), deletions, insertions of the described sequences, and combinations thereof, wherein 
said mutations, insertions and deletions permit formation of stable hybrids with the target 
polynucleotide of interest. Mutations, insertions, and deletions can be produced in a given 
polynucleotide sequence in many ways, and these methods are known to an ordinarily skilled 
artisan. Other methods may become known in the future. 

The known methods include, but are not limited to: 

(1) synthesizing chemically or otherwise an artificial sequence which is a mutation,
insertion or deletion of the known sequence; 

(2) using a nucleotide sequence of the present invention as a probe to obtain via 
hybridization a new sequence or a mutation, insertion or deletion of the probe 
sequence; and 

(3) mutating, inserting or deleting a test sequence in vitro or in vivo. 

It is important to note that the mutational, insertional, and deletional variants generated
from a given probe may be more or less efficient than the original probe. Notwithstanding such 
differences in efficiency, these variants are within the scope of the present invention. 

Thus, mutational, insertional, and deletional variants of the disclosed nucleotide 
sequences can be readily prepared by methods which are well known to those skilled in the art. 
These variants can be used in the same manner as the exemplified primer sequences so long as 
the variants have substantial sequence homology with the original sequence. As used herein, 
substantial sequence homology refers to homology which is sufficient to enable the variant to 
function in the same capacity as the original probe. Preferably, this homology is greater than 
50%; more preferably, this homology is greater than 75%; and most preferably, this homology 
is greater than 90%. The degree of homology needed for the variant to function in its intended
capacity will depend upon the intended use of the sequence. It is well within the skill of a
person trained in this art to make mutational, insertional, and deletional mutations which are
designed to improve the function of the sequence or otherwise provide a methodological
advantage.
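
The homology thresholds above can be illustrated with a short calculation. The sketch below is only one possible reading: it assumes that "homology" is measured as simple position-by-position identity between two pre-aligned sequences of equal length, which the text does not specify. The function names and the variant sequence are hypothetical.

```python
def percent_identity(original: str, variant: str) -> float:
    """Percent of positions at which two pre-aligned, equal-length sequences match."""
    if len(original) != len(variant) or not original:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(original.upper(), variant.upper()))
    return 100.0 * matches / len(original)


def preference_tier(percent: float) -> str:
    """Map a percentage onto the >50%, >75%, and >90% tiers named in the text."""
    if percent > 90.0:
        return "most preferred"
    if percent > 75.0:
        return "more preferred"
    if percent > 50.0:
        return "preferred"
    return "below preferred range"


# Reverse 3 primer (SEQ ID NO. 6) as printed in this document.
reverse_3 = "GTAAACAAGCTCGCCACCGC"
# Hypothetical variant carrying a single substitution at the 3' end.
variant = "GTAAACAAGCTCGCCACCGT"

pct = percent_identity(reverse_3, variant)
print(f"{pct:.1f}% identity: {preference_tier(pct)}")
```

A gapped alignment would be needed before insertional or deletional variants could be scored this way.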

PCR technology. Polymerase Chain Reaction (PCR) is a repetitive, enzymatic, primed
synthesis of a nucleic acid sequence. This procedure is well known and commonly used by
those skilled in this art (see Mullis, U.S. Patent Nos. 4,683,195, 4,683,202, and 4,800,159; Saiki,
Randall K., Stephen Scharf, Fred Faloona, Kary B. Mullis, Glenn T. Horn, Henry A. Erlich,
Norman Arnheim [1985] "Enzymatic Amplification of β-Globin Genomic Sequences and
Restriction Site Analysis for Diagnosis of Sickle Cell Anemia," Science 230:1350-1354). PCR
is based on the enzymatic amplification of a DNA fragment of interest that is flanked by two 
oligonucleotide primers that hybridize to opposite strands of the target sequence. The primers 
are oriented with the 3' ends pointing towards each other. Repeated cycles of heat denaturation 
of the template, annealing of the primers to their complementary sequences, and extension of 
the annealed primers with a DNA polymerase result in the amplification of the segment defined 
by the 5' ends of the PCR primers. Since the extension product of each primer can serve as a
template for the other primer, each cycle essentially doubles the amount of DNA fragment 
produced in the previous cycle. This results in the exponential accumulation of the specific 
target fragment, up to several million-fold in a few hours. By using a thermostable DNA 
polymerase such as Taq polymerase, which is isolated from the thermophilic bacterium Thermus 
aquaticus, the amplification process can be completely automated. 
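
The doubling argument above can be checked with simple arithmetic. The following sketch assumes an idealized reaction in which every cycle multiplies the target by (1 + efficiency), with efficiency 1.0 meaning perfect doubling. Real reactions plateau, so the numbers are upper bounds, and the function names are hypothetical.

```python
import math


def fold_amplification(cycles: int, efficiency: float = 1.0) -> float:
    """Idealized fold increase of the target fragment after a number of cycles."""
    if cycles < 0 or not 0.0 <= efficiency <= 1.0:
        raise ValueError("cycles must be >= 0 and efficiency within [0, 1]")
    return (1.0 + efficiency) ** cycles


def cycles_needed(target_fold: float, efficiency: float = 1.0) -> int:
    """Smallest whole number of cycles that reaches the requested fold increase."""
    if target_fold < 1.0 or not 0.0 < efficiency <= 1.0:
        raise ValueError("target_fold must be >= 1 and efficiency within (0, 1]")
    return math.ceil(math.log(target_fold) / math.log(1.0 + efficiency))


for n in (10, 20, 22):
    print(n, "cycles:", f"{fold_amplification(n):,.0f}-fold")
print("cycles for a million-fold increase:", cycles_needed(1_000_000))
```

Under perfect doubling, roughly 20 to 22 cycles correspond to the "several million-fold" accumulation described in the text.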

The DNA sequences of the subject invention can be used as primers for PCR 
amplification. In performing PCR amplification, a certain degree of mismatch can be tolerated 
between primer and template. Therefore, mutations, deletions, and insertions (especially 
additions of nucleotides to the 5' end) of the exemplified primers fall within the scope of the 
subject invention. Mutations, insertions and deletions can be produced in a given primer by 
methods known to an ordinarily skilled artisan. It is important to note that the mutational, 
insertional, and deletional variants generated from a given primer sequence may be more or less
efficient than the original sequences. Notwithstanding such differences in efficiency, these 
variants are within the scope of the present invention. 

Following are examples which illustrate procedures for practicing the invention. These 
examples should not be construed as limiting. All percentages are by weight and all solvent 
mixture proportions are by volume unless otherwise noted.

Example 1 - Culturing of B.t. Isolates Useful According to the Invention

A subculture of B.t. isolates, or mutants thereof, can be used to inoculate the following 
peptone, glucose, salts medium: 

Bacto Peptone        7.5 g/l
Glucose              1.0 g/l
KH2PO4               3.4 g/l
K2HPO4               4.35 g/l
Salt Solution        5.0 ml/l
CaCl2 Solution       5.0 ml/l
pH 7.2

Salts Solution (100 ml)
MgSO4·7H2O           2.46 g
MnSO4·H2O            0.04 g
ZnSO4·7H2O           0.28 g
FeSO4·7H2O           0.40 g

CaCl2 Solution (100 ml)
CaCl2·2H2O           3.66 g

The salts solution and CaCl2 solution are filter-sterilized and added to the autoclaved
and cooked broth at the time of inoculation. Flasks are incubated at 30°C on a rotary shaker at
200 rpm for 64 hr.

The above procedure can be readily scaled up to large fermentors by procedures well 
known in the art.

The B.t. spores and/or crystals, obtained in the above fermentation, can be isolated by 
procedures well known in the art. A frequently-used procedure is to subject the harvested 
fermentation broth to separation techniques, e.g., centrifugation. 

Alternatively, a subculture of B.t. isolates, or mutants thereof, can be used to inoculate 
the following medium, known as TB broth:

Tryptone             12 g/l
Yeast Extract        24 g/l
Glycerol             4 g/l
KH2PO4               2.1 g/l
K2HPO4               14.7 g/l
pH 7.4

The potassium phosphate was added to the autoclaved broth after cooling. Flasks were 
incubated at 30°C on a rotary shaker at 250 rpm for 24-36 hours. 

The above procedure can be readily scaled up to large fermentors by procedures well 
known in the art. 

The B.t. obtained in the above fermentation can be isolated by procedures well known
in the art. A frequently-used procedure is to subject the harvested fermentation broth to
separation techniques, e.g., centrifugation. In a specific embodiment, B.t. proteins useful
according to the present invention can be obtained from the supernatant. The culture supernatant
containing the active protein(s) was used in bioassays as discussed below.

Example 2 - Identification of Genes Encoding Novel Lepidopteran-Active Bacillus thuringiensis
Toxins

Two primer pairs useful for the identification and classification of novel toxin genes by 
PCR amplification of polymorphic DNA fragments near the 3' ends of B.t. toxin genes were 
designed. These oligonucleotide primers allow the discrimination of genes encoding toxins in 
the Cry7, Cry8, or Cry9 subfamilies from genes for the more common lepidopteran-active toxins 
in the Cry1 subfamily based on size differences for the amplified DNA. The sequences of these
primers are: 

Forward 1  5' CGTGGCTATATCCTTCGTGTYAC 3' (SEQ ID NO. 1)
Reverse 1  5' ACRATRAATGTTCCTTCYGTTTC 3' (SEQ ID NO. 2)
Forward 2  5' GGATATGTMTTACGTGTAACWGC 3' (SEQ ID NO. 3)
Reverse 2  5' CTACACTTTCTATRTTGAATRYACCTTC 3' (SEQ ID NO. 4)
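
The primers above contain degenerate positions (Y, R, M, W), so each represents a small pool of specific oligonucleotides. The sketch below counts and expands those pools using the standard IUPAC nucleotide ambiguity codes. It assumes the sequences as printed here with the internal spaces removed; the helper names are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

PRIMERS = {
    "Forward 1": "CGTGGCTATATCCTTCGTGTYAC",       # SEQ ID NO. 1
    "Reverse 1": "ACRATRAATGTTCCTTCYGTTTC",       # SEQ ID NO. 2
    "Forward 2": "GGATATGTMTTACGTGTAACWGC",       # SEQ ID NO. 3
    "Reverse 2": "CTACACTTTCTATRTTGAATRYACCTTC",  # SEQ ID NO. 4
}


def degeneracy(primer: str) -> int:
    """Number of distinct specific sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """List every specific sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, {degeneracy(seq)}-fold degenerate")
```

Calling `expand(PRIMERS["Forward 1"])` returns the two specific 23-mers that differ only at the Y position.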

Standard PCR amplification (Perkin Elmer, Foster City, CA) using primer pair 1 (SEQ
ID NOS. 1 and 2) of the subject invention yields DNA fragments approximately 415-440 base
pairs in length from B.t. toxin genes related to the cry1 subfamily.

PCR amplification using primer pair 2 (SEQ ID NOS. 3 and 4) according to the subject
invention yields DNA fragments approximately 230-290 base pairs in length from cry7, cry8,
or cry9 subfamily toxin genes.
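
The size-based discrimination described above amounts to a range lookup. The sketch below encodes the two approximate size windows from the text. The optional tolerance parameter and all names are assumptions added for illustration, since the text gives the ranges only as approximate.

```python
# Approximate amplicon size windows (bp) stated in the text for each primer pair.
EXPECTED_SIZES_BP = {
    1: ((415, 440), "cry1-related"),
    2: ((230, 290), "cry7, cry8, or cry9 subfamily"),
}


def classify_amplicon(primer_pair: int, size_bp: int, tolerance_bp: int = 0) -> str:
    """Return the gene class suggested by an amplicon size, or 'unclassified'."""
    (low, high), label = EXPECTED_SIZES_BP[primer_pair]
    if low - tolerance_bp <= size_bp <= high + tolerance_bp:
        return label
    return "unclassified"


print(classify_amplicon(1, 430))
print(classify_amplicon(2, 260))
print(classify_amplicon(2, 430))
```

A band falling outside both windows is reported as unclassified and is not assigned to either group.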

These primers can be used according to the subject invention to identify genes encoding 
novel toxins. Crude DNA templates for PCR were prepared from B.t. strains. A loopful of cells
was scraped from an overnight plate culture of Bacillus thuringiensis and resuspended in 300
µl TE buffer (10 mM Tris-Cl, 1 mM EDTA, pH 8.0). Proteinase K was added to 0.1 mg/ml and
the cell suspension was heated to 55°C for 15 minutes. The suspension was then boiled for 15
minutes. Cellular debris was pelleted in a microfuge and the supernatant containing the DNA
was transferred to a clean tube.

PCR was carried out using the primer pair consisting of the Forward 2 (SEQ ID NO. 3) 
and Reverse 2 (SEQ ID NO. 4) oligonucleotides described above. Strains were identified that 
contained genes characterized by amplification of DNA fragments approximately 230-290 bp 
in length. Spore-crystal preparations from these strains were subsequently tested for bioactivity 
against Agrotis ipsilon and additional lepidopteran targets.

PS185U2 was examined using both primer pairs 1 and 2 (SEQ ID NOS. 1 and 2 and 
SEQ ID NOS. 3 and 4, respectively). In this strain, primer pair 1 (SEQ ID NOS. 1 and 2) 
yielded a DNA band of the size expected for toxin genes related to the cry1 subfamily.

Example 3 - Restriction Fragment Length Polymorphism (RFLP) Analysis of Bacillus
thuringiensis Toxin Genes Present in Lepidopteran-Active Strains

Total cellular DNA was prepared from Bacillus thuringiensis (B.t.) strains grown to an 
optical density, at 600 nm, of 1.0. Cells were pelleted by centrifugation and resuspended in 
protoplast buffer (20 mg/ml lysozyme in 0.3 M sucrose, 25 mM Tris-Cl [pH 8.0], 25 mM
EDTA). After incubation at 37°C for 1 hour, protoplasts were lysed by two cycles of freezing
and thawing. Nine volumes of a solution of 0.1 M NaCl, 0.1% SDS, 0.1 M Tris-Cl were added
to complete lysis. The cleared lysate was extracted twice with phenol:chloroform (1:1). Nucleic
acids were precipitated with two volumes of ethanol and pelleted by centrifugation. The pellet
was resuspended in TE buffer and RNase was added to a final concentration of 50 µg/ml. After
incubation at 37°C for 1 hour, the solution was extracted once each with phenol:chloroform
(1:1) and TE-saturated chloroform. DNA was precipitated from the aqueous phase by the
addition of one-tenth volume of 3 M NaOAc and two volumes of ethanol. DNA was pelleted by
centrifugation, washed with 70% ethanol, dried, and resuspended in TE buffer.

Two types of PCR-amplified, 32P-labeled DNA probes were used in standard Southern
hybridizations of total cellular B.t. DNA to characterize toxin genes by RFLP. The first probe
(A) was a DNA fragment amplified using the following primers:

Forward 3: 5' CCAGWTTTAYAGGAGG 3' (SEQ ID NO. 5)
Reverse 3: 5' GTAAACAAGCTCGCCACCGC 3' (SEQ ID NO. 6)

The second probe (B) was either the 230-290 bp or 415-440 bp DNA fragment amplified
with the primers described in the previous example.

Hybridization of immobilized DNA on Southern blots with the aforementioned
32P-labeled probes was performed by standard methods (Maniatis, T., E.F. Fritsch, J. Sambrook
[1982] Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring
Harbor, NY). In general, hybridization and subsequent washes were carried out under moderate
stringency. For double-stranded DNA gene probes, hybridization was carried out overnight at
20-25°C below the melting temperature (Tm) of the DNA hybrid in 6X SSPE, 5X Denhardt's
solution, 0.1% SDS, 0.1 mg/ml denatured DNA. The melting temperature is described by the
following formula (Beltz, G.A., K.A. Jacobs, T.H. Eickbush, P.T. Cherbas, and F.C. Kafatos
[1983] In Methods in Enzymology, R. Wu, L. Grossman and K. Moldave (eds.), Academic Press,
New York, 100:266-285):

Tm = 81.5°C + 16.6 Log[Na+] + 0.41(%G+C) - 0.61(%formamide) - 600/length of duplex
in base pairs.
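
As a worked illustration, the melting temperature formula above can be evaluated directly. The sketch below assumes that Log is the base-10 logarithm and that [Na+] is expressed in mol/l. The input values are illustrative placeholders only and are not taken from this document; in particular, the sodium concentration of the hybridization buffer is not stated here.

```python
import math


def melting_temperature(na_molar: float, gc_percent: float,
                        formamide_percent: float, duplex_length_bp: int) -> float:
    """Tm in degrees C for a DNA duplex, following the formula quoted in the text."""
    if na_molar <= 0 or duplex_length_bp <= 0:
        raise ValueError("sodium concentration and duplex length must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 600.0 / duplex_length_bp)


def hybridization_window(tm: float) -> tuple[float, float]:
    """Temperature range 20-25 degrees C below Tm, as used for hybridization."""
    return (tm - 25.0, tm - 20.0)


# Illustrative placeholder inputs: 1.0 M Na+, 40% G+C, no formamide, 300 bp duplex.
tm = melting_temperature(na_molar=1.0, gc_percent=40.0,
                         formamide_percent=0.0, duplex_length_bp=300)
low, high = hybridization_window(tm)
print(f"Tm = {tm:.1f} C; hybridize between {low:.1f} C and {high:.1f} C")
```

With these placeholder inputs the terms are 81.5 + 0 + 16.4 - 0 - 2.0, so the computed Tm is about 95.9°C.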

Washes were typically carried out as follows:

(1) Twice at room temperature for 15 minutes in 1X SSPE, 0.1% SDS (low stringency
wash).

(2) Once at Tm - 20°C for 15 minutes in 0.2X SSPE, 0.1% SDS (moderate stringency
wash).

RFLP data was obtained for the ten strains most active on Agrotis ipsilon (Tables 3 and
4). The hybridizing DNA bands described here contain all or part of the novel toxin genes under
investigation.

Example 4 - DNA Sequencing of Toxin Genes

PCR-amplified segments of toxin genes present in B.t. strains active on Agrotis ipsilon
were sequenced. To accomplish this, amplified DNA fragments obtained using primers Forward
3 (SEQ ID NO. 5) and Reverse 3 (SEQ ID NO. 6) were first cloned into the PCR DNA
TA-cloning plasmid vector, pCRII, as described by the supplier (Invitrogen, San Diego, CA).
Several individual pCRII clones from the mixture of amplified DNA fragments from each B.t.
strain were chosen for sequencing. Colonies were lysed by boiling to release crude plasmid
DNA. DNA templates for automated sequencing were amplified by PCR using vector-specific
primers flanking the plasmid multiple cloning sites. These DNA templates were sequenced
using Applied Biosystems (Foster City, CA) automated sequencing methodologies. Toxin gene
sequences and their corresponding nucleotide sequences, described below (SEQ ID NO. 7
through SEQ ID NO. 62), were identified by this method. These sequences are listed in Table
5. The polypeptide sequences deduced from these nucleotide sequences are also shown.

From these partial gene sequences, seven oligonucleotides useful as PCR primers or
hybridization probes were designed. The sequences of these oligonucleotides are the following:

5' GTTCATTGGTATAAGAGTTGGTG 3' (SEQ ID NO. 63)
5' CCACTGCAAGTCCGGACCAAATTCG 3' (SEQ ID NO. 64)
5' GAATATATTCCCGTCYATCTCTGG 3' (SEQ ID NO. 65)
5' GCACGAATTACTGTAGCGATAGG 3' (SEQ ID NO. 66)
5' GCTGGTAACTTTGGAGATATGCGTG 3' (SEQ ID NO. 67)
5' GATTTCTTTGTAACACGTGGAGG 3' (SEQ ID NO. 68)
5' CACTACTAATCAGAGCGATCTG 3' (SEQ ID NO. 69)

Specific toxin gene sequences and the oligonucleotide probes that enable identification
of these genes by hybridization, or by PCR in combination with the Reverse 3 primer described
above, are listed in Table 5.

Table 5. Sequence ID reference numbers

Strain     Toxin      Peptide          Nucleotide       Probe used
PS11B      11B1AR     SEQ ID NO. 7     SEQ ID NO. 8
           11B1BR     SEQ ID NO. 9     SEQ ID NO. 10    SEQ ID NO. 65
HD129      1291A      SEQ ID NO. 11    SEQ ID NO. 12    SEQ ID NO. 63
           1292A      SEQ ID NO. 13    SEQ ID NO. 14    SEQ ID NO. 64
           1292B      SEQ ID NO. 15    SEQ ID NO. 16
PS31G1     31GA       SEQ ID NO. 17    SEQ ID NO. 18    SEQ ID NO. 65
           31GBR      SEQ ID NO. 19    SEQ ID NO. 20
PS185U2    85N1R      SEQ ID NO. 21    SEQ ID NO. 22
           85N2       SEQ ID NO. 23    SEQ ID NO. 24
           85N3       SEQ ID NO. 25    SEQ ID NO. 26    SEQ ID NO. 66
PS86V1     86V1C1     SEQ ID NO. 27    SEQ ID NO. 28    SEQ ID NO. 68
           86V1C2     SEQ ID NO. 29    SEQ ID NO. 30    SEQ ID NO. 64
           86V1C3R    SEQ ID NO. 31    SEQ ID NO. 32    SEQ ID NO. 69
HD525      F525A      SEQ ID NO. 33    SEQ ID NO. 34    SEQ ID NO. 64
           F525B      SEQ ID NO. 35    SEQ ID NO. 36    SEQ ID NO. 63
           F525C      SEQ ID NO. 37    SEQ ID NO. 38
HD573      F573A      SEQ ID NO. 39    SEQ ID NO. 40    SEQ ID NO. 63
           F573B      SEQ ID NO. 41    SEQ ID NO. 42    SEQ ID NO. 67
           F573C      SEQ ID NO. 43    SEQ ID NO. 44    SEQ ID NO. 64
PS86BB1    FBB1A      SEQ ID NO. 45    SEQ ID NO. 46    SEQ ID NO. 68
           FBB1BR     SEQ ID NO. 47    SEQ ID NO. 48    SEQ ID NO. 69
           FBB1C      SEQ ID NO. 49    SEQ ID NO. 50    SEQ ID NO. 64
           FBB1D      SEQ ID NO. 51    SEQ ID NO. 52    SEQ ID NO. 63
PS89J3     J31AR      SEQ ID NO. 53    SEQ ID NO. 54    SEQ ID NO. 68
           J32AR      SEQ ID NO. 55    SEQ ID NO. 56    SEQ ID NO. 64
PS86W1     W1FAR      SEQ ID NO. 57    SEQ ID NO. 58    SEQ ID NO. 68
           W1FBR      SEQ ID NO. 59    SEQ ID NO. 60    SEQ ID NO. 69
           W1FC       SEQ ID NO. 61    SEQ ID NO. 62    SEQ ID NO. 64


Example 5 - Isolation and DNA Sequencing of Full-Length Toxin Genes 

Total cellular DNA was extracted from B.t. strains using standard procedures known in 
the art. See, e.g., Example 3, above. Gene libraries of size-fractionated Sau3A partial 
restriction fragments of total cellular DNA were constructed in the bacteriophage vector, 
Lambda-Gem11. Recombinant phage were packaged and plated on E. coli KW251 cells.
Plaques were screened by hybridization with radiolabeled gene-specific probes derived from
DNA fragments PCR-amplified with oligonucleotide primers SEQ ID NOS. 5 and 6.
Hybridizing phage were plaque-purified and used to infect liquid cultures of E. coli KW251
cells for isolation of DNA by standard procedures (Maniatis, T., E.F. Fritsch, J. Sambrook
[1982] Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring
Harbor, NY). Toxin genes were subsequently subcloned into pBluescript vectors (Stratagene)
for DNA sequence analysis.

The full-length toxin genes listed below were sequenced using Applied Biosystems
(Foster City, CA) automated sequencing methodologies. The toxin gene sequences and the
respective predicted polypeptide sequences are listed below.

Source Strain    Peptide SEQ ID    Nucleotide SEQ ID    Toxin designation
PS86BB1          SEQ ID NO. 70     SEQ ID NO. 71        86BB1(a)
PS86BB1          SEQ ID NO. 72     SEQ ID NO. 73        86BB1(b)
PS31G1           SEQ ID NO. 74     SEQ ID NO. 75        31G1(a)

Recombinant E. coli NM522 strains containing these plasmids encoding these toxins were
deposited with NRRL on June 27, 1997.

Strain    Plasmid     Toxin designation    NRRL number
MR922     pMYC2451    86BB1(a)             B-21794
MR923     pMYC2453    86BB1(b)             B-21795
MR924     pMYC2454    31G1(a)              B-21796



Example 6 - Heterologous Expression of Novel B.t. Toxins in Pseudomonas fluorescens (P.f.)

Full-length toxin genes were engineered into plasmid vectors by standard DNA cloning
methods, and transformed into Pseudomonas fluorescens for expression. Recombinant bacterial
strains (Table 6) were grown in shake flasks for production of toxin for expression and
quantitative bioassay against a variety of lepidopteran insect pests.

Table 6. Recombinant Pseudomonas fluorescens strains for heterologous expression of
novel toxins

Source Strain    Plasmid     Toxin       Recombinant P.f. Strain
PS86BB1          pMYC2804    86BB1(a)    MR1259
PS86BB1          pMYC2805    86BB1(b)    MR1260
PS31G1           pMYC2430    31G1(a)     MR1264

Example 7 - Processing of Endotoxins with Trypsin

Cultures of Pseudomonas fluorescens were grown for 48 hrs. as per standard procedures.
Cell pellets were harvested by centrifugation and washed three times with water and stored at
-70°C. Endotoxin inclusions were isolated from cells treated with lysozyme and DNAse by
differential centrifugation. Toxins isolated in this manner were then processed to limit peptides
by trypsinolysis and were then used for bioassays on lepidopteran pests.

Detailed protocols follow. Toxin inclusion bodies were prepared from the washed crude 
cell pellets as follows: 

4 L of Lysis Buffer (prepare day of use)

                         gm
Tris base                24.22
NaCl                     46.75
Glycerol                 252
Dithiothreitol           0.62
EDTA Disodium salt       29.78
Triton X-100             20 mls

Adjust pH to 7.5 with HCl and bring up to final volume (4 L) with distilled water.
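
The recipe above gives masses per 4 L. The following sketch converts them to approximate molar concentrations. The formula weights are standard reference values that do not appear in this document, and the EDTA entry assumes the disodium salt dihydrate, which the recipe does not state. All names are hypothetical.

```python
def millimolar(grams: float, formula_weight: float, litres: float) -> float:
    """Concentration in mM for a mass of solute dissolved to a final volume."""
    return 1000.0 * grams / formula_weight / litres


FINAL_VOLUME_L = 4.0

# name: (grams per 4 L from the recipe, assumed formula weight in g/mol)
LYSIS_BUFFER = {
    "Tris base": (24.22, 121.14),
    "NaCl": (46.75, 58.44),
    "Dithiothreitol": (0.62, 154.25),
    "EDTA disodium salt (assumed dihydrate)": (29.78, 372.24),
}

for name, (grams, fw) in LYSIS_BUFFER.items():
    print(f"{name}: {millimolar(grams, fw, FINAL_VOLUME_L):.1f} mM")

# Glycerol and Triton X-100 are more naturally expressed as percentages.
print(f"Glycerol: {100.0 * 252 / 4000:.1f}% (w/v)")
print(f"Triton X-100: {100.0 * 20 / 4000:.1f}% (v/v)")
```

Under these assumptions the buffer works out to roughly 50 mM Tris, 200 mM NaCl, 1 mM dithiothreitol, and 20 mM EDTA.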
1. Thaw frozen cell pellet in 37°C water bath.
2. Add the lysis buffer until the 500 ml polycarbonate centrifuge bottles are as full
as possible (~400 ml total volume). Disperse by inversion of the bottle or using
the Polytron at low rpm.
3. Centrifuge (10,000 x g) for 20 minutes at 4°C.
4. Decant and discard supernatant.
5. Resuspend pellet in 5 ml of lysis buffer for every gram of pellet, using the
Polytron at low rpm to disperse the pellet.
6. Add 25 mg/ml lysozyme solution to the suspension to a final concentration of
0.6 mg/ml.
7. Incubate at 37°C for 4 minutes. Invert every 30 seconds.
8. Place suspension on ice for 1 hour.
9. Add 2.5 M MgCl2·6H2O to the tubes to a final concentration of 60 mM. Add a
40 mg/ml deoxyribonuclease I (Sigma) solution to get a final concentration of
0.5 mg/ml.
10. Incubate overnight at 4°C.
11. Homogenize the lysate using the Polytron at low rpm.
12. Centrifuge at 10,000 g at 4°C for 20 minutes. Decant and discard supernatant.
13. Resuspend the inclusion pellet in lysis buffer. Check microscopically for
complete cell lysis.
14. Wash the inclusion pellet in lysis buffer 5 times (repeat steps 2-5).
15. Store as a suspension in 10 mM Tris-Cl pH 7.5, 0.1 mM PMSF at
-70°C in 1.5 ml Eppitubes.
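
Steps 6 and 9 above specify stock and final concentrations but not the volumes to add. The sketch below computes the stock volume needed for a given suspension volume, accounting for the volume the stock itself contributes. The 100 ml suspension volume is a hypothetical example, each addition is treated independently of the others, and the function name is an assumption.

```python
def stock_volume_to_add(sample_ml: float, stock_conc: float, final_conc: float) -> float:
    """Volume of stock (ml) to add to a sample so the mixture reaches final_conc.

    Solves stock_conc * v / (sample_ml + v) = final_conc for v.
    Both concentrations must be in the same units.
    """
    if not 0 < final_conc < stock_conc:
        raise ValueError("final concentration must be positive and below the stock")
    return sample_ml * final_conc / (stock_conc - final_conc)


suspension_ml = 100.0  # hypothetical suspension volume

additions = {
    "lysozyme (25 mg/ml stock to 0.6 mg/ml)": (25.0, 0.6),
    "MgCl2 (2500 mM stock to 60 mM)": (2500.0, 60.0),
    "DNase I (40 mg/ml stock to 0.5 mg/ml)": (40.0, 0.5),
}

for name, (stock, final) in additions.items():
    volume = stock_volume_to_add(suspension_ml, stock, final)
    print(f"{name}: add {volume:.2f} ml")
```

Because the lysozyme and MgCl2 stocks are both diluted by the same factor (0.6/25 = 60/2500), they require the same volume per ml of suspension.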

Digestion of inclusions with trypsin is performed as follows:

Digestion solution:
1. 2 ml 1 M NaCAPS pH 10.5
2. Inclusion preparation (as much as 100 mg protein)
3. Trypsin at a 1:100 ratio with the amount of protein to be cleaved (added during
the procedure)
4. H2O to a final volume of 10 ml

Trypsin treatment is performed as follows:
1. Incubate the digestion solution, minus trypsin, at 37°C for 15 minutes.
2. Add trypsin at 1:100 (trypsin:toxin protein wt/wt).
3. Incubate solution for 2 hours at 37°C with occasional mixing by inversion.
4. Centrifuge the digestion solution for 15 minutes at 15,000 g at 4°C.
5. Remove and save the supernatant.
6. Supernatant is analyzed by SDS-PAGE and used for bioassay as discussed
below.

Example 8 - Expression of a Gene from B.t. strain HD129 in a Chimeric Construct 

A gene was isolated from B.t. strain HD129. This gene appears to be a pseudogene with
no obvious translational initiation codon. To express this gene from HD129, we designed and
constructed a gene fusion with the first 28 codons of cry1Ac in a Pseudomonas expression system.
The nucleotide and peptide sequences of this chimeric toxin are shown in SEQ ID NOS. 76 and
77. Upon induction, recombinant P. fluorescens containing this novel chimeric toxin expressed
the polypeptide of the predicted size.



Example 9 - Further Sequencing of Toxin Genes 



WO 98/00546 PCT/US97/U658 


DNA of soluble toxins from the isolates listed in Table 7 was sequenced. The SEQ ID
NOS. of the sequences thus obtained are also reported in Table 7.



Table 7.

Source Isolate    Protein SEQ ID NO.    Nucleotide SEQ ID NO.    Toxin Name
PS11B             78                    79                       11B(a)
PS31G1            80                    81                       31G1(b)
PS86BB1           82                    83                       86BB1(c)
PS86V1            84                    85                       86V1(a)
PS86W1            86                    87                       86W1(a)
PS94R1            88                    89                       94R1(a)
PS185U2           90                    91                       185U2(a)
PS202S            92                    93                       202S(a)
PS213E5           94                    95                       213E5(a)
PS218G2           96                    97                       218G2(a)
HD29              98                    99                       29HD(a)
HD110             100                   101                      110HD(a)
HD129             102                   103                      129HD(b)
HD573             104                   105                      573HD(a)


Example 10 - Black Cutworm Bioassay

Suspensions of powders containing B.t. isolates were prepared by mixing an appropriate
amount of powder with distilled water and agitating vigorously. Suspensions were mixed with
black cutworm artificial diet (BioServ, Frenchtown, NJ) amended with 28 grams alfalfa powder
(BioServ) and 1.2 ml formalin per liter of finished diet. Suspensions were mixed with finished
artificial diet at a rate of 3 ml suspension plus 27 ml diet. After vortexing, this mixture was
poured into plastic trays with compartmentalized 3 ml wells (Nutrend Container Corporation,
Jacksonville, FL). A water blank containing no B.t. served as the control. Early first-instar
Agrotis ipsilon larvae (French Agricultural Services, Lamberton, MN) were placed singly onto
the diet mixture. Wells were then sealed with "MYLAR" sheeting (ClearLam Packaging, IL)
using a tacking iron, and several pinholes were made in each well to provide gas exchange.
Larvae were held at 29°C for four days in a 14:10 (light:dark) holding room. Mortality was
recorded after four days.
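
Mixing 3 ml of suspension with 27 ml of diet is a ten-fold dilution. The sketch below shows the arithmetic linking a suspension concentration to the concentration in the finished mixture. It assumes simple volumetric dilution and that concentrations are expressed per ml of the final mixture, neither of which is stated in the text. The function names are hypothetical.

```python
def diet_concentration(suspension_conc: float,
                       suspension_ml: float = 3.0,
                       diet_ml: float = 27.0) -> float:
    """Concentration in the finished mixture after adding suspension to diet."""
    return suspension_conc * suspension_ml / (suspension_ml + diet_ml)


def suspension_needed(target_diet_conc: float,
                      suspension_ml: float = 3.0,
                      diet_ml: float = 27.0) -> float:
    """Suspension concentration required to reach a target concentration in the mixture."""
    return target_diet_conc * (suspension_ml + diet_ml) / suspension_ml


# Suspension concentrations that would give a two-fold dilution series in the diet.
for target in (200.0, 100.0, 50.0, 25.0):
    print(f"{target:g} per ml diet requires {suspension_needed(target):g} per ml suspension")
```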

The following B.t. isolates were found to have activity against black cutworm:
PS185U2, PS11B, PS218G2, PS213E5, PS86W1, PS28C, PS86BB1, PS89J3, PS86V1, PS94R1,
HD525, HD573, PS27J2, HD110, HD10, PS202S, HD29, PS101DD, HD129, and PS31G1.

Bioassay results are shown in Table 8.



Table 8. Percentage black cutworm mortality associated with B.t. isolates

                 Estimated toxin concentration (ug toxin/mL diet)
Sample           200      100      50       25
PS86BB1          51       25       9        1
PS31G1           30       20       7        5
PS11B            37       16       3        0
HD573            11       13       3        0
HD129            87       73       43       7
PS86V1           73       29       19       3
PS89J3           68       27       15       3
PS86W1           61       23       12       15
PS185U2          69       32       14       16
HD525            67       20       11       4
water control    1

Example 11 - Activity of B.t. Isolates Against Agrotis ipsilon

Strains were tested as supernatant cultures. Samples were applied to black cutworm
artificial diet (BioServ, Frenchtown, NJ) and allowed to air dry before larval infestation. A
water blank containing no B.t. served as the control. Eggs were applied to each treated well, and
the wells were then sealed with "MYLAR" sheeting (ClearLam Packaging, IL) using a tacking
iron, and several pinholes were made in each well to provide gas exchange. Bioassays were
held at 25°C for 7 days in a 14:10 (light:dark) holding room. Mortality was recorded after seven
days. Strains exhibiting mortality against A. ipsilon (greater than water control) are reported
in Table 9.

Table 9. Larvicidal activity of B.t. concentrated supernatants in a top load bioassay on A.
ipsilon neonates

Strain     Activity
PS86W1     +
PS28C      +
PS86BB1    +
PS89J3     +
PS86V1     +
PS94R1     +
HD573      +


Example 12 - Activity of B.t. Isolates and Pseudomonas fluorescens Clones Against Heliothis
virescens (Fabricius) and Helicoverpa zea (Boddie)

Strains were tested as either frozen Pseudomonas fluorescens clones or B.t. supernatant
culture samples. Suspensions of clones were prepared by individually mixing samples with
distilled water and agitating vigorously. For diet incorporation bioassays, suspensions were
mixed with the artificial diet at a rate of 6 ml suspension plus 54 ml diet. After vortexing, this
mixture was poured into plastic trays with compartmentalized 3-ml wells (Nutrend Container
Corporation, Jacksonville, FL). Supernatant samples were mixed at a rate of 3-6 ml with the
diet as outlined above. In top load bioassays, suspensions or supernatants were applied to the
top of the artificial diet and allowed to air dry before larval infestation. A water blank served
as the control. First instar larvae (USDA-ARS, Stoneville, MS) were placed singly onto the diet
mixture. Wells were then sealed with "MYLAR" sheeting (ClearLam Packaging) using a
tacking iron, and several pinholes were made in each well to provide gas exchange. Larvae were
held at 25°C for 6 days in a 14:10 (light:dark) holding room. Mortality was recorded after six
days.

Results are as follows:

Table 10. Larvicidal activity of B.t. concentrated supernatants in a top load bioassay

               Total Protein    H. virescens               H. zea
Strain         (ug/cm2)         % Mortality    Stunting    % Mortality    Stunting
HD129          44.4             100            yes         50             yes
               44.4             81             yes         50             yes
               47.6             100            yes         36             no
[illegible]    23.4             100            yes         100            yes
               23.4             100            yes         95             yes
               21.2             100            yes         96             yes
               21.2                                        100            yes
PS31G1         8.3              70             yes         39             yes
               8.3              17             yes         30             yes
               3.6              29             yes         30             yes
               3.6                                         0              no



Table 11. Strains tested in diet incorporation bioassay on H. virescens and H. zea

            H. virescens                     H. zea
Strain      Total protein    % Mortality     Total protein    % Mortality
            (ug/ml diet)                     (ng/ml diet)
PS11B       NA*              45              268              96
PS185U2     55               100             55               100
PS31G1      0                50              43.4             13
PS86BB1     23.3             100             23.3             100
PS86V1      17               100             17               92
PS86W1      18               100             18               83
PS89J3      13               100             13               81
HD129       NA               100             138.3            13
HD525       3                96              171.7            0
HD573A      3                96              78.3             21

* Protein information not available.

Table 12. H. virescens dose response in diet incorporation bioassays using frozen spore
crystal preparations

MR#             LC50 (ng/ml)
1259            13.461
1259 trypsin    1.974
1260            12.688
1260 trypsin    0.260
1264            95.0
1264 trypsin    2.823


Example 13 - Insertion of Toxin Genes Into Plants

One aspect of the subject invention is the transformation of plants with genes encoding
the insecticidal toxin. The transformed plants are resistant to attack by the target pest.

Genes encoding pesticidal toxins, as disclosed herein, can be inserted into plant cells
using a variety of techniques which are well known in the art. For example, a large number of
cloning vectors comprising a replication system in E. coli and a marker that permits selection
of the transformed cells are available for preparation for the insertion of foreign genes into
higher plants. The vectors comprise, for example, pBR322, pUC series, M13mp series,
pACYC184, etc. Accordingly, the sequence encoding the B.t. toxin can be inserted into the
vector at a suitable restriction site. The resulting plasmid is used for transformation into E. coli.
The E. coli cells are cultivated in a suitable nutrient medium, then harvested and lysed. The
plasmid is recovered. Sequence analysis, restriction analysis, electrophoresis, and other
biochemical-molecular biological methods are generally carried out as methods of analysis.
After each manipulation, the DNA sequence used can be cleaved and joined to the next DNA
sequence. Each plasmid sequence can be cloned in the same or other plasmids. Depending on
the method of inserting desired genes into the plant, other DNA sequences may be necessary.
If, for example, the Ti or Ri plasmid is used for the transformation of the plant cell, then at least
the right border, but often the right and the left border of the Ti or Ri plasmid T-DNA, has to
be joined as the flanking region of the genes to be inserted.

The use of T-DNA for the transformation of plant cells has been intensively researched
and sufficiently described in EP 120 516; Hoekema (1985) In: The Binary Plant Vector System,
Offset-drukkerij Kanters B.V., Alblasserdam, Chapter 5; Fraley et al., Crit. Rev. Plant Sci. 4:1-
46; and An et al. (1985) EMBO J. 4:277-287.

Once the inserted DNA has been integrated in the genome, it is relatively stable there
and, as a rule, does not come out again. It normally contains a selection marker that confers on
the transformed plant cells resistance to a biocide or an antibiotic, such as kanamycin, G418,
bleomycin, hygromycin, or chloramphenicol, inter alia. The individually employed marker
should accordingly permit the selection of transformed cells rather than cells that do not contain
the inserted DNA.

A large number of techniques are available for inserting DNA into a plant host cell. 
Those techniques include transformation with T-DNA using Agrobacterium tumefaciens or 
Agrobacterium rhizogenes as transformation agent, fusion, injection, biolistics (microparticle 
bombardment), or electroporation as well as other possible methods. If Agrobacteria are used 
for the transformation, the DNA to be inserted has to be cloned into special plasmids, namely 
either into an intermediate vector or into a binary vector. The intermediate vectors can be 
integrated into the Ti or Ri plasmid by homologous recombination owing to sequences that are 
homologous to sequences in the T-DNA. The Ti or Ri plasmid also comprises the vir region 
necessary for the transfer of the T-DNA. Intermediate vectors cannot replicate themselves in 
Agrobacteria. The intermediate vector can be transferred into Agrobacterium tumefaciens by 
means of a helper plasmid (conjugation). Binary vectors can replicate themselves both in E. coli 
and in Agrobacteria. They comprise a selection marker gene and a linker or polylinker which 
are framed by the right and left T-DNA border regions. They can be transformed directly into 
Agrobacteria (Holsters et al. [1978] Mol. Gen. Genet. 163:181-187). The Agrobacterium used 
as host cell is to comprise a plasmid carrying a vir region. The vir region is necessary for the 
transfer of the T-DNA into the plant cell. Additional T-DNA may be contained. The bacterium 
so transformed is used for the transformation of plant cells. Plant explants can advantageously 
be cultivated with Agrobacterium tumefaciens or Agrobacterium rhizogenes for the transfer of 
the DNA into the plant cell. Whole plants can then be regenerated from the infected plant 
material (for example, pieces of leaf, segments of stalk, roots, but also protoplasts or suspension-
cultivated cells) in a suitable medium, which may contain antibiotics or biocides for selection. 
The plants so obtained can then be tested for the presence of the inserted DNA. No special 
demands are made of the plasmids in the case of injection and electroporation. It is possible to 
use ordinary plasmids, such as, for example, pUC derivatives. 
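
The plasmid requirements stated in the preceding paragraph can be summarized as a small lookup. This is only an illustrative sketch: the dictionary keys and the function name `required_vector` are hypothetical labels chosen here and are not terms used in the application.

```python
# Illustrative summary of the vector requirements stated in the text.
# The keys and the function name are hypothetical labels, not patent terms.
VECTOR_REQUIREMENTS = {
    "agrobacterium": "intermediate vector or binary vector",
    "injection": "ordinary plasmid (for example, a pUC derivative)",
    "electroporation": "ordinary plasmid (for example, a pUC derivative)",
}


def required_vector(method: str) -> str:
    """Return the plasmid type the text associates with a delivery method."""
    key = method.strip().lower()
    if key not in VECTOR_REQUIREMENTS:
        raise ValueError(f"no vector requirement stated for method: {method!r}")
    return VECTOR_REQUIREMENTS[key]


print(required_vector("Agrobacterium"))
```

Methods for which the text states no plasmid requirement (fusion, biolistics) are deliberately left out of the table rather than guessed.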

The transformed cells grow inside the plants in the usual manner. They can form germ 
cells and transmit the transformed trait(s) to progeny plants. Such plants can be grown in the 
normal manner and crossed with plants that have the same transformed hereditary factors or 
other hereditary factors. The resulting hybrid individuals have the corresponding phenotypic 
properties. 

In a preferred embodiment of the subject invention, plants will be transformed with 
genes wherein the codon usage has been optimized for plants. See, for example, U.S. Patent No. 
5,380,831, which is hereby incorporated by reference. Also, advantageously, plants encoding 
a truncated toxin will be used. The truncated toxin typically will comprise about 55% to about 
80% of the full length toxin. Methods for creating synthetic B.t. genes for use in plants are 
known in the art. 
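
As a worked illustration of the 55% to 80% range, the following sketch computes the corresponding residue counts for a full length toxin. The function name and the example length of 1200 residues are hypothetical and are not taken from the application.

```python
# Sketch: residue range for a truncated toxin covering about 55% to 80%
# of the full length toxin. The 1200-residue example is hypothetical.
def truncated_length_range(full_length: int) -> tuple[int, int]:
    """Return (shortest, longest) truncated lengths in residues, rounded down."""
    if full_length <= 0:
        raise ValueError("full_length must be positive")
    # Integer arithmetic avoids floating-point rounding surprises.
    return full_length * 55 // 100, full_length * 80 // 100


low, high = truncated_length_range(1200)
print(f"about {low} to about {high} residues")
```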

It should be understood that the examples and embodiments described herein are for 
illustrative purposes only and that various modifications or changes in light thereof will be 
suggested to persons skilled in the art and are to be included within the spirit and purview of this 
application and the scope of the appended claims. 

SEQUENCE LISTING 



(1) GENERAL INFORMATION:

(i) APPLICANT:

Applicant Name(s): MYCOGEN CORPORATION
Street address: 5501 Oberlin Drive
City: San Diego
State/Province: California
Country: US
Postal code/Zip: 92121
Phone number: (619) 453-8030
Fax number: (619) 453-6991

(ii) TITLE OF INVENTION: Toxins Active Against Pests

(iii) NUMBER OF SEQUENCES: 105

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Saliwanchik, Lloyd & Saliwanchik

(B) STREET: 2421 N.W. 41st Street, Suite A-1

(C) CITY: Gainesville

(D) STATE: Florida

(E) COUNTRY: USA

(F) ZIP: 32606

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: US

(B) FILING DATE:

(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 08/674,002

(B) FILING DATE: 01-JUL-1996

(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Sanders, Jay M.

(B) REGISTRATION NUMBER: 39,355

(C) REFERENCE/DOCKET NUMBER: MA-701C1

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (352) 375-8100

(B) TELEFAX: (352) 372-5800



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

CGTGGCTATA TCCTTCGTGT YAC 23



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

ACRATRAATG TTCCTTCYGT TTC 23
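
The oligonucleotides in this listing use IUPAC ambiguity codes (for example Y for C or T, R for A or G), so each entry stands for a pool of concrete sequences. The following sketch, which is not part of the application, counts and enumerates that pool; the function names are chosen here for illustration only.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of concrete sequences represented by a degenerate primer."""
    count = 1
    for base in primer.replace(" ", "").upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """Enumerate every concrete sequence represented by a degenerate primer."""
    cleaned = primer.replace(" ", "").upper()
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in cleaned))]


seq_id_no_1 = "CGTGGCTATA TCCTTCGTGT YAC"
seq_id_no_2 = "ACRATRAATG TTCCTTCYGT TTC"
print(degeneracy(seq_id_no_1), expand(seq_id_no_1))
print(degeneracy(seq_id_no_2))
```
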
(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GGATATGTMT TACGTGTAAC WGC 23



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

CTACACTTTC TATRTTGAAT RYACCTTC 28



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 16 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

CCAGWTTTAY AGGAGG 16



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GTAAACAAGC TCGCCACCGC 20
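
SEQ ID NO: 6 appears to be the reverse complement of the 3' end shared by several of the DNA fragments listed below; the check here uses the final row of SEQ ID NO: 18. The sketch is an illustration added for the reader, handles only unambiguous bases, and its function name is an assumption rather than anything defined in the application.

```python
# Complement table for the four unambiguous DNA bases only.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    cleaned = seq.replace(" ", "").upper()
    return cleaned.translate(COMPLEMENT)[::-1]


seq_id_no_6 = "GTAAACAAGC TCGCCACCGC"
# Final row of SEQ ID NO: 18 as given in this listing.
seq_id_no_18_last_row = "GCGGAAGAAG ATTTAGAAGA TGCAAAGAAA GCGGTGGCGA GCTTGTTTAC"

rc = reverse_complement(seq_id_no_6)
print(rc)
print(seq_id_no_18_last_row.replace(" ", "").endswith(rc))
```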



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 137 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Pro Gly Phe Xaa Gly Gly Asp Ile Leu Arg Arg Thr Ser Pro Xaa Gln 
1 5 10 15 

Ile Ser Xaa Leu Arg Val Asn Ile Thr Ala Pro Leu Ser Gln Arg Tyr 
20 25 30 

Arg Val Arg Ile Xaa Xaa Ala Ser Thr Thr Xaa Xaa Gln Phe His Thr 
35 40 45 

Ser Ile Xaa Gly Arg Pro Ile Asn Gln Gly Asn Phe Ser Xaa Thr Met 
50 55 60 

Ser Ser Gly Ser Asn Leu Gln Ser Gly Xaa Phe Arg Thr Val Gly Phe 
65 70 75 80 

Thr Thr Pro Xaa Asn Phe Ser Asn Gly Ser Ser Val Phe Thr Leu Ser 
85 90 95 

Xaa His Val Phe Asn Ser Gly Asn Glu Val Tyr Ile Asp Arg Ile Glu 
100 105 110 

Phe Val Pro Ala Glu Val Thr Phe Glu Ala Glu Tyr Asp Leu Glu Arg 
115 120 125 

Ala Xaa Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 413 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

CCAGGATTTA YAGGAGGAGA TATTCTTCGA AGAACTTCAC CTGKSCAGAT TTCAWCCTTA 60

AGAGTAAATA TTACTGCACC ATTATCACAA AGATATCGGG TAAGAATTCR CWACGCTTCT 120

ACYACAWATT TWCAATTCCA TACATCAATT GRCGGAAGAC CTATTAATCA GGGKAATTTT 180

TCASCAACTA TGAGTAGTGG GAGTAATTTA CAGTCCGGAA KCTTTAGGAC TGTAGGTTTT 240

ACTACTCCGT KTAACTTTTC AAATGGATCA AGTGTATTTA CGTTAAGTKC TCATGTCTTC 300

AATTCAGGCA ATGAAGTTTA TATAGATCGA ATTGAATTTG TTCCGGCAGA AGTAACCTTT 360

GAGGCAGAAT ATGATTTAGA AAGAGCACMA AAGGCGGTGG CGAGCTTGTT TAC 413
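
The Xaa positions in SEQ ID NO: 7 line up with codons of SEQ ID NO: 8 that contain ambiguity codes. The sketch below, added for illustration and not taken from the application, translates a DNA string with the standard genetic code and reports Xaa whenever the possible codons at a position do not all encode the same residue. The function name, the `offset` parameter, and the use of "Ter" for stop codons are choices made here.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_1 = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_1)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "Ter",
}

IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate(dna: str, offset: int = 0) -> list[str]:
    """Translate DNA to three-letter residues; ambiguous codons give Xaa."""
    seq = dna.replace(" ", "").upper()[offset:]
    residues = []
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        # Every concrete codon this (possibly degenerate) codon can stand for.
        options = {CODON_TABLE["".join(c)] for c in product(*(IUPAC[b] for b in codon))}
        residues.append(THREE_LETTER[options.pop()] if len(options) == 1 else "Xaa")
    return residues


first_row = "CCAGGATTTA YAGGAGGAGA TATTCTTCGA AGAACTTCAC CTGKSCAGAT TTCAWCCTTA"
print(" ".join(translate(first_row)))
```

For the first row of SEQ ID NO: 8 this reproduces the first twenty residues of SEQ ID NO: 7, including the three Xaa positions.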



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 136 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr Asp Gly Gly Xaa 
1 5 10 15 

Val Gly Thr Ile Arg Ala Asn Val Asn Ala Pro Leu Thr Gln Gln Tyr 
20 25 30 

Arg Ile Arg Leu Arg Tyr Ala Ser Thr Thr Ser Phe Val Val Asn Leu 
35 40 45 

Phe Val Asn Asn Ser Ala Ala Gly Phe Thr Leu Pro Ser Thr Met Ala 
50 55 60 

Gln Asn Gly Ser Leu Thr Xaa Glu Ser Phe Asn Thr Leu Glu Val Thr 
65 70 75 80 

His Xaa Ile Arg Phe Ser Gln Ser Asp Thr Thr Leu Arg Leu Asn Ile 
85 90 95 

Phe Pro Ser Ile Ser Gly Gln Xaa Val Tyr Val Asp Lys Xaa Glu Ile 
100 105 110 

Val Pro Xaa Asn Pro Thr Arg Glu Ala Glu Glu Asp Leu Glu Asp Xaa 
115 120 125 

Lys Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 410 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

CCAGGWTTTA CAGGAGGGGA TATACTTCGA AGAACGGACG GTGGTRCAGT TGGAACGATT 60

AGAGCTAATG TTAATGCCCC ATTAACACAA CAATATCGTA TAAGATTACG CTATGCTTCG 120

ACAACAAGTT TTGTTGTTAA TTTATTTGTT AATAATAGTG CGGCTGGCTT TACTTTACCG 180

AGTACAATGG CTCAAAATGG TTCTTTAACA YRCGAGTCGT TTAATACCTT AGAGGTAACT 240

CATWCTATTA GATTTTCACA GTCAGATACT ACACTTAGGT TGAATATATT CCCGTCYATC 300

TCTGGTCAAG RAGTGTATGT AGATAAACWT GAAATCGTTC CAWTTAACCC GACACGAGAA 360

GCGGAAGAAG ATTTAGAAGA TSCAAAGAAA GCGGTGGCGA GCTTGTTTAC 410



(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 137 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Pro Gly Phe Xaa Gly Gly Asp Ile Leu Arg Arg Thr Gly Val Gly Thr 
1 5 10 15 

Phe Gly Thr Ile Arg Val Arg Xaa Thr Ala Pro Leu Thr Gln Arg Tyr 
20 25 30 

Arg Ile Arg Phe Arg Phe Ala Xaa Thr Thr Asn Leu Phe Ile Gly Ile 
35 40 45 

Arg Val Gly Asp Arg Gln Val Asn Tyr Phe Asp Phe Gly Arg Thr Met 
50 55 60 

Asn Arg Gly Asp Glu Leu Arg Tyr Glu Ser Phe Ala Thr Arg Glu Phe 
65 70 75 80 

Thr Thr Asp Phe Asn Phe Arg Gln Pro Gln Glu Leu Ile Ser Val Phe 
85 90 95 

Ala Asn Ala Phe Ser Ala Gly Gln Glu Val Tyr Phe Asp Arg Ile Glu 
100 105 110 

Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Lys Glu Asp Leu Glu Ala 
115 120 125 

Ala Lys Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 413 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
CCAGGTTTTA YAGGAGGGGA TATACTCCGA AGAACAGGGG TTGGTACATT TGGAACAATA 60

AGGGTAAGGA YTACTGCCCC CTTAACACAA AGATATCGCA TAAGATTCCG TTTCGCTTYT 120

ACCACAAATT TGTTCATTGG TATAAGAGTT GGTGATAGAC AAGTAAATTA TTTTGACTTC 180

GGAAGAACAA TGAACAGAGG AGATGAATTA AGGTACGAAT CTTTTGCTAC AAGGGAGTTT 240

ACTACTGATT TTAATTTTAG ACAACCTCAA GAATTAATCT CAGTGTTTGC AAATGCATTT 300

AGCGCTGGTC AAGAAGTTTA TTTTGATAGA ATTGAGATTA TCCCCGTTAA TCCCGCACGA 360

GAGGCGAAAG AGGATYTAGA AGCAGCAAAG AAAGCGGTGG CGAGCTTGTT TAC 413



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 135 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Gly Phe Ile Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser Leu 
1 5 10 15 

Gly Val Leu Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr Arg 
20 25 30 

Ile Xaa Val Arg Tyr Ala Xaa Thr Thr Asn Ile Arg Leu Ser Val Asn 
35 40 45 

Gly Ser Phe Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg Leu 
50 55 60 

Gly Glu Asp Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn Thr 
65 70 75 80 

Ser Ile Arg Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile Glu 
85 90 95 

Pro Ser Phe Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe Ile 
100 105 110 

Pro Val Asn Pro Thr Arg Glu Ala Lys Glu Asp Leu Glu Ala Ala Lys 
115 120 125 

Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 407 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

GGMTTTATAG GAGGAGCTCT ACTTCAAAGG ACTGACCATG GTTCGCTTGG AGTATTGAGG 60

GTCCAATTTC CACTTCACTT AAGACAACAA TATCGTATTA SAGTCCGTTA TGCTTYTACA 120

ACAAATATTC GATTGAGTGT GAATGGCAGT TTCGGTACTA TTTCTCAAAA TCTCCCTAGT 180

ACAATGAGAT TAGGAGAGGA TTTAAGATAC GGATCTTTTG CTATAAGAGA GTTTAATACT 240

TCTATTAGAC CCACTGCAAG TCCGGACCAA ATTCGATTGA CAATAGAACC ATCTTTTATT 300

AGACAAGAGG TCTATGTAGA TAGAATTGAG TTCATTCCAG TTAATCCGAC GCGAGAGGCG 360

AAAGAGGATC TAGAAGCAGC AAAAAAAGCG GTGGCGAGCT TGTTTAC 407
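
The stated lengths of the paired DNA and protein entries can be cross-checked with simple arithmetic: a fragment of n bases read from a given frame offset holds at most (n - offset) // 3 complete codons. The sketch below is an illustration only; the function name is hypothetical, and the frame offset of 0 is an assumption that is consistent with the first codons of SEQ ID NOs: 8, 10 and 14.

```python
def max_residues(n_bases: int, offset: int = 0) -> int:
    """Number of complete codons in n_bases when reading starts at offset."""
    if n_bases < 0 or offset < 0:
        raise ValueError("lengths and offsets must be non-negative")
    return max(n_bases - offset, 0) // 3


# (DNA length in base pairs, protein length in amino acids) as stated in
# the listing for SEQ ID NOs: 8/7, 10/9 and 14/13.
stated_pairs = [(413, 137), (410, 136), (407, 135)]
for n_bases, n_residues in stated_pairs:
    print(n_bases, n_residues, max_residues(n_bases) == n_residues)
```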



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 137 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr Ser Pro Gly Gln 
1 5 10 15 

Ile Ser Thr Leu Arg Val Asn Ile Thr Ala Pro Leu Ser Gln Arg Tyr 
20 25 30 

Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr Asn Leu Gln Phe His Thr 
35 40 45 

Ser Ile Asp Gly Arg Pro Ile Asn Gln Gly Asn Phe Ser Ala Thr Met 
50 55 60 

Ser Ser Gly Ser Asn Leu Gln Ser Gly Ser Phe Arg Thr Val Gly Phe 
65 70 75 80 

Thr Thr Pro Phe Asn Phe Ser Asn Gly Ser Ser Val Phe Thr Leu Ser 
85 90 95 

Ala His Val Phe Asn Ser Gly Asn Glu Val Tyr Ile Asp Arg Ile Glu 
100 105 110 

Phe Val Pro Ala Glu Val Thr Phe Glu Ala Glu Tyr Asp Leu Glu Arg 
115 120 125 

Ala Gln Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 413 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

CCAGGATTTA CAGGAGGAGA TATTCTTCGA AGAACTTCAC CTGGCCAGAT TTCAACCTTA 60

AGAGTAAATA TTACTGCACC ATTATCACAA AGATATCGGG TAAGAATTCG CTACGCTTCT 120

ACCACAAATT TACAATTCCA TACATCAATT GACGGAAGAC CTATTAATCA GGGGAATTTT 180

TCAGCAACTA TGAGTAGTGG GAGTAATTTA CAGTCCGGAA GCTTTAGGAC TGTAGGTTTT 240

ACTACTCCGT TTAACTTTTC AAATGGATCA AGTGTATTTA CGTTAAGTGC TCATGTCTTC 300

AATTCAGGCA ATGAAGTTTA TATAGATCGA ATTGAATTTG TTCCGGCAGA AGTAACCTTT 360

GAGGCAGAAT ATGATTTAGA AAGAGCGCAA AAGGCGGTGG CGAGCTTGTT TAC 413



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 136 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Pro Gly Phe Xaa Gly Gly Asp Ile Leu Arg Arg Thr Asp Gly Gly Ala 
1 5 10 15 

Val Gly Thr Ile Arg Ala Asn Val Asn Ala Pro Leu Thr Gln Gln Tyr 
20 25 30 

Arg Ile Arg Leu Arg Tyr Ala Ser Thr Thr Ser Phe Val Val Asn Leu 
35 40 45 

Phe Val Asn Asn Ser Ala Ala Gly Phe Thr Leu Pro Ser Thr Met Ala 
50 55 60 

Gln Asn Gly Ser Leu Thr Tyr Glu Ser Phe Asn Thr Leu Glu Val Thr 
65 70 75 80 

His Thr Ile Arg Phe Ser Gln Ser Asp Thr Thr Leu Arg Leu Asn Ile 
85 90 95 

Phe Pro Ser Ile Ser Gly Gln Glu Val Tyr Val Asp Lys Leu Glu Ile 
100 105 110 

Val Pro Ile Asn Pro Thr Arg Glu Ala Glu Glu Asp Leu Glu Asp Ala 
115 120 125 

Lys Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 410 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

CCAGGWTTTA YAGGAGGGGA TATACTTCGA AGAACGGACG GTGGTGCAGT TGGAACGATT 60

AGAGCTAATG TTAATGCCCC ATTAACACAA CAATATCGTA TAAGATTACG CTATGCTTCG 120

ACAACAAGTT TTGTTGTTAA TTTATTTGTT AATAATAGTG CGGCTGGCTT TACTTTACCG 180

AGTACAATGG CTCAAAATGG TTCTTTAACA TACGAGTCGT TTAATACCTT AGAGGTAACT 240

CATACTATTA GATTTTCACA GTCAGATACT ACACTTAGGT TGAATATATT CCCGTCTATC 300

TCTGGTCAAG AAGTGTATGT AGATAAACTT GAAATCGTTC CAATTAACCC GACACGAGAA 360

GCGGAAGAAG ATTTAGAAGA TGCAAAGAAA GCGGTGGCGA GCTTGTTTAC 410



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 137 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Pro Gly Phe Xaa Gly Gly Asp Ile Leu Arg Arg Thr Ser Pro Gly Gln 
1 5 10 15 

Ile Ser Thr Leu Arg Val Asn Ile Thr Ala Pro Leu Ser Gln Arg Tyr 
20 25 30 

Arg Val Arg Ile Arg Tyr Ala Xaa Thr Thr Asn Leu Gln Phe His Thr 
35 40 45 

Ser Ile Asp Gly Arg Pro Ile Asn Gln Gly Asn Phe Ser Ala Thr Met 
50 55 60 

Ser Ser Gly Ser Asn Leu Gln Ser Gly Ser Phe Arg Thr Val Gly Phe 
65 70 75 80 

Thr Thr Pro Phe Asn Phe Ser Asn Gly Ser Ser Val Phe Thr Leu Ser 
85 90 95 

Ala His Val Phe Asn Ser Gly Asn Glu Val Tyr Ile Asp Arg Ile Glu 
100 105 110 

Phe Val Pro Ala Glu Val Thr Phe Glu Ala Glu Tyr Asp Leu Glu Arg 
115 120 125 

Ala Gln Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 413 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

CCAGGWTTTA YAGGAGGAGA TATTCTTCGA AGAACTTCAC CTGGCCAGAT TTCAACCTTA 60

AGAGTAAATA TTACTGCACC ATTATCACAA AGATATCGGG TAAGAATTCG CTACGCTTYT 120

ACYACAAATT TACAATTCCA TACATCAATT GACGGAAGAC CTATTAATCA GGGKAATTTT 180

TCAGCAACTA TGAGTAGTGG GAGTAATTTA CAGTCCGGAA GCTTTAGGAC TGTAGGTTTT 240

ACTACTCCGT TTAACTTTTC AAATGGATCA AGTGTATTTA CGTTAAGTGC TCATGTCTTC 300

AATTCAGGCA ATGAAGTTTA TATAGATCGA ATTGAATTTG TTCCGGCAGA AGTAACCTTT 360

GAGGCAGAAT ATGATTTAGA AAGAGCACAA AAGGCGGTGG CGAGCTTGTT TAC 413



(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 106 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Phe Thr Gly Gly Asp Ile Leu Arg Arg Asn Thr Ile Gly Glu Phe Val 
1 5 10 15 

Ser Leu Gln Val Asn Ile Asn Ser Pro Ile Thr Gln Arg Tyr Arg Leu 
20 25 30 

Arg Phe Arg Tyr Ala Ser Ser Arg Asp Ala Arg Ile Thr Val Ala Ile 
35 40 45 

Gly Gly Gln Ile Arg Val Asp Met Thr Leu Glu Lys Thr Met Glu Ile 
50 55 60 

Gly Glu Ser Leu Thr Xaa Arg Thr Phe Ser Tyr Thr Asn Phe Ser Asn 
65 70 75 80 

Pro Phe Ser Phe Arg Ala Asn Pro Asp Ile Ile Arg Ile Ala Glu Glu 
85 90 95 

Leu Pro Ile Arg Gly Gly Glu Leu Val Tyr 
100 105 



(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 318 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

TTTACAGGAG GGGATATCCT TCGAAGAAAT ACCATTGGTG AGTTTGTGTC TTTACAAGTC 60

AATATTAACT CACCAATTAC CCAAAGATAC CGTTTAAGAT TTCGTTATGC TTCCAGTAGG 120

GATGCACGAA TTACTGTAGC GATAGGAGGA CAAATTAGAG TAGATATGAC CCTTGAAAAA 180

ACCATGGAAA TTGGGGAGAG CTTAACATYT AGAACATTTA GCTATACCAA TTTTAGTAAT 240

CCTTTTTCAT TTAGGGCTAA TCCAGATATA ATTAGAATAG CTGAAGAACT TCCTATTCGC 300

GGTGGCGAGC TTGTTTAC 318



(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 96 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

Ile Pro Leu Val Ser Leu Cys Leu Tyr Lys Ser Ile Leu Thr His Gln 
1 5 10 15 

Leu Pro Lys Asp Thr Val Xaa Xaa Phe Val Met Leu Pro Val Gly Met 
20 25 30 

His Glu Leu Leu Xaa Arg Xaa Glu Asp Lys Leu Glu Xaa Ile Xaa Pro 
35 40 45 

Leu Lys Lys Pro Trp Lys Leu Gly Arg Ala Xaa His Leu Glu His Leu 
50 55 60 

Ala Ile Pro Ile Leu Val Ile Leu Phe His Leu Gly Leu Ile Gln Ile 
65 70 75 80 

Xaa Leu Glu Xaa Leu Lys Asn Phe Leu Phe Ala Val Ala Ser Leu Phe 
85 90 95 



(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 292 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

AAATACCATT GGTGAGTTTG TGTCTTTACA AGTCAATATT AACTCACCAA TTACCCAAAG 60

ATACCGTTTA ARATTTCGTT ATGCTTCCAG TAGGGATGCA CGAATTACTG TAGCGATAGG 120

AGGACAAATT AGAGTAGATA TGACCCTTGA AAAAACCATG GAAATTGGGG AGAGCTTAAC 180

ATCTAGAACA TTTAGCTATA CCAATTTTAG TAATCCTTTT TCATTTAGGG CTAATCCAGA 240

TATAATTAGA ATAGCTGAAG AACTTCCTAT TCGCGGTGGC GAGCTTGTTT AC 292



(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 108 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

Pro Gly Phe Xaa Gly Gly Asp Ile Leu Arg Arg Asn Thr Ile Gly Glu 
1 5 10 15 

Phe Val Ser Leu Gln Val Asn Ile Asn Ser Pro Ile Thr Gln Arg Tyr 
20 25 30 

Arg Leu Arg Phe Arg Tyr Ala Ser Ser Arg Asp Ala Arg Ile Thr Val 
35 40 45 

Ala Ile Gly Gly Gln Ile Arg Val Xaa Met Thr Leu Glu Lys Thr Met 
50 55 60 

Glu Ile Gly Glu Ser Leu Thr Ser Arg Thr Phe Ser Tyr Thr Asn Phe 
65 70 75 80 

Ser Asn Pro Phe Ser Phe Arg Ala Asn Pro Asp Ile Ile Arg Ile Ala 
85 90 95 

Glu Glu Leu Pro Ile Arg Gly Gly Glu Leu Val Tyr 
100 105 



(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 324 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

CCAGGWTTTA YAGGAGGGGA TATCCTTCGA AGAAATACCA TTGGTGAGTT TGTGTCTTTA 60

CAAGTCAATA TTAACTCACC AATTACCCAA AGATACCGTT TAAGATTTCG TTATGCTTCC 120

AGTAGGGATG CACGAATTAC TGTAGCGATA GGAGGACAAA TTAGAGTAKA TATGACCCTT 180

GAAAAAACCA TGGAAATTGG GGAGAGCTTA ACATCTAGAA CATTTAGCTA TACCAATTTT 240

AGTAATCCTT TTTCATTTAG GGCTAATCCA GATATAATTA GAATAGCTGA AGAACTTCCT 300

ATTCGCGGTG GCGAGCTTGT TTAC 324

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 136 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

Gly Phe Xaa Gly Gly Asp Val Ile Arg Arg Thr Asn Thr Gly Gly Phe 
1 5 10 15 

Gly Ala Ile Arg Val Ser Val Thr Gly Pro Leu Thr Gln Arg Tyr Arg 
20 25 30 

Ile Arg Phe Arg Tyr Ala Ser Thr Ile Asp Phe Asp Phe Phe Val Thr 
35 40 45 

Arg Gly Gly Thr Thr Ile Asn Asn Phe Arg Phe Thr Arg Thr Met Asn 
50 55 60 

Arg Gly Gln Glu Ser Arg Tyr Glu Ser Tyr Arg Thr Val Glu Phe Thr 
65 70 75 80 

Thr Pro Phe Asn Phe Thr Gln Ser Gln Asp Ile Ile Arg Thr Xaa Ile 
85 90 95 

Gln Gly Leu Ser Gly Asn Gly Glu Val Tyr Leu Asp Arg Ile Glu Ile 
100 105 110 

Ile Pro Val Asn Pro Thr Arg Glu Ala Glu Glu Asp Leu Glu Ala Ala 
115 120 125 

Lys Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 411 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

AGGATTTAYA GGAGGAGATG TAATCCGAAG AACAAATACT GGTGGATTCG GAGCAATAAG 60

GGTGTCGGTC ACTGGACCGC TAACACAACG ATATCGCATA AGGTTCCGTT ATGCTTCGAC 120

AATAGATTTT GATTTCTTTG TAACACGTGG AGGAACTACT ATAAATAATT TTAGATTTAC 180

ACGTACAATG AACAGGGGAC AGGAATCAAG ATATGAATCC TATCGTACTG TAGAGTTTAC 240

AACTCCTTTT AACTTTACAC AAAGTCAAGA TATAATTCGA ACAYCTATCC AGGGACTTAG 300

TGGAAATGGG GAAGTATACC TTGATAGAAT TGAAATCATC CCTGTAAATC CAACACGAGA 360

AGCGGAAGAR GATTTAGAAG CGGCGAAGAA AGCGGTGGCG AGCTTGTTTA C 411

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 136 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

Pro Gly Phe Ile Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser 
1 5 10 15 

Leu Gly Val Leu Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr 
20 25 30 

Arg Ile Arg Val Arg Tyr Ala Ser Thr Thr Asn Ile Arg Leu Ser Val 
35 40 45 

Asn Gly Ser Phe Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg 
50 55 60 

Leu Gly Glu Asp Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn 
65 70 75 80 

Thr Ser Ile Arg Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile 
85 90 95 

Glu Pro Ser Phe Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe 
100 105 110 

Ile Pro Val Asn Pro Thr Arg Glu Ala Lys Glu Asp Leu Glu Ala Ala 
115 120 125 

Lys Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 410 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

CCAGGATTTA TAGGAGGAGC TCTACTTCAA AGGACTGACC ATGGTTCGCT TGGAGTATTG 60

AGGGTCCAAT TTCCACTTCA CTTAAGACAA CAATATCGTA TTAGAGTCCG TTATGCTTCT 120

ACAACAAATA TTCGATTGAG TGTGAATGGC AGTTTCGGTA CTATTTCTCA AAATCTCCCT 180

AGTACAATGA GATTAGGAGA GGATTTAAGA TACGGATCTT TTGCTATAAG AGAGTTTAAT 240

ACTTCTATTA GACCCACTGC AAGTCCGGAC CAAATTCGAT TGACAATAGA ACCATCTTTT 300

ATTAGACAAG AGGTCTATGT AGATAGAATT GAGTTCATTC CAGTTAATCC GACGCGAGAG 360

GCGAAAGAGG ATCTAGAAGC AGCAAAAAAA GCGGTGGCGA GCTTGTTTAC 410

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 142 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31: 

Pro Gly Phe Xaa Gly Gly Gly Ile Leu Arg Arg Thr Thr Asn Gly Thr 
1 5 10 15 

Phe Gly Thr Leu Arg Val Thr Val Asn Ser Pro Leu Thr Gln Arg Tyr 
20 25 30 

Arg Val Arg Val Arg Phe Ala Ser Ser Gly Asn Phe Ser Ile Arg Ile 
35 40 45 

Leu Arg Gly Asn Thr Ser Ile Ala Tyr Gln Arg Phe Gly Ser Thr Met 
50 55 60 

Asn Arg Gly Gln Glu Leu Thr Tyr Glu Ser Phe Val Thr Ser Glu Phe 
65 70 75 80 

Thr Thr Asn Gln Ser Asp Leu Pro Phe Thr Phe Thr Gln Ala Gln Glu 
85 90 95 

Asn Leu Thr Ile Leu Ala Glu Gly Val Ser Thr Gly Ser Glu Tyr Phe 
100 105 110 

Ile Asp Arg Ile Glu Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Glu 
115 120 125 

Glu Asp Leu Glu Ala Ala Lys Lys Ala Val Ala Ser Leu Phe 
130 135 140 



(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 428 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

CCAGGWTTTA YAGGAGGGGG TATACTCCGA AGAACAACTA ATGGCACATT TGGAACGTTA 60

AGAGTAACAG TTAATTCACC ATTAACACAA AGATATCGCG TAAGAGTTCG TTTTGCTTCA 120

TCAGGAAATT TCAGCATAAG GATACTGCGT GGAAATACCT CTATAGCTTA TCAAAGATTT 180

GGGAGTACAA TGAACAGAGG ACAGGAACTA ACTTACGAAT CATTTGTCAC AAGTGAGTTC 240

ACTACTAATC AGAGCGATCT GCCTTTTACA TTTACACAAG CTCAAGAAAA TTTAACAATC 300

CTTGCAGAAG GTGTTAGCAC CGGTAGTGAA TATTTTATAG ATAGAATTGA AATCATCCCT 360

GTGAACCCGG CACGAGAAGC AGAAGAGGAT TTAGAAGCRG CGAAGAAAGC GGTGGCGAGC 420

TTGTTTAC 428



(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 136 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

Pro Gly Phe Ile Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser 
1 5 10 15 

Leu Gly Val Leu Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr 
20 25 30 

Arg Ile Arg Val Arg Tyr Ala Ser Thr Thr Asn Ile Arg Leu Ser Val 
35 40 45 

Asn Gly Ser Phe Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg 
50 55 60 

Leu Gly Glu Asp Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn 
65 70 75 80 

Thr Ser Ile Arg Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile 
85 90 95 

Glu Pro Ser Phe Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe 
100 105 110 

Ile Pro Val Asn Pro Thr Arg Glu Ala Lys Glu Asp Leu Glu Ala Ala 
115 120 125 

Lys Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 410 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

CCAGGATTTA TAGGAGGAGC TCTACTTCAA AGGACTGACC ATGGTTCGCT TGGAGTATTG 60

AGGGTCCAAT TTCCACTTCA CTTAAGACAA CAATATCGTA TTAGAGTCCG TTATGCTTCT 120

ACAACAAATA TTCGATTGAG TGTGAATGGC AGTTTCGGTA CTATTTCTCA AAATCTCCCT 180

AGTACAATGA GATTAGGAGA GGATTTAAGA TACGGATCTT TTGCTATAAG AGAGTTTAAT 240

ACTTCTATTA GACCCACTGC AAGTCCGGAC CAAATTCGAT TGACAATAGA ACCATCTTTT 300

ATTAGACAAG AGGTCTATGT AGATAGAATT GAGTTCATTC CAGTTAATCC GACGCGAGAG 360

GCGAAAGAGG ATCTAGAAGC AGCAAAAAAA GCGGTGGCGA GCTTGTTTAC 410



(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 137 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr Gly Val Gly Thr 
1 5 10 15 

Phe Gly Thr Ile Arg Val Arg Thr Thr Ala Pro Leu Thr Gln Arg Tyr 
20 25 30 

Arg Ile Arg Phe Arg Phe Ala Ser Thr Thr Asn Leu Phe Ile Gly Ile 
35 40 45 

Arg Val Gly Asp Arg Gln Val Asn Tyr Phe Asp Phe Gly Arg Thr Met 
50 55 60 

Asn Arg Gly Asp Glu Leu Arg Tyr Glu Ser Phe Ala Thr Arg Glu Phe 
65 70 75 80 

Thr Thr Asp Phe Asn Phe Arg Gln Pro Gln Glu Leu Ile Ser Val Phe 
85 90 95 

Ala Asn Ala Phe Ser Ala Gly Gln Glu Val Tyr Phe Asp Arg Ile Glu 
100 105 110 

Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Lys Glu Asp Leu Glu Ala 
115 120 125 

Ala Lys Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 413 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

CCAGGTTTTA CAGGAGGGGA TATACTCCGA AGAACAGGGG TTGGTACATT TGGAACAATA 60

AGGGTAAGGA CTACTGCCCC CTTAACACAA AGATATCGCA TAAGATTCCG TTTCGCTTCT 120

ACCACAAATT TGTTCATTGG TATAAGAGTT GGTGATAGAC AAGTAAATTA TTTTGACTTC 180

GGAAGAACAA TGAACAGAGG AGATGAATTA AGGTACGAAT CTTTTGCTAC AAGGGAGTTT 240

ACTACTGATT TTAATTTTAG ACAACCTCAA GAATTAATCT CAGTGTTTGC AAATGCATTT 300

AGCGCTGGTC AAGAAGTTTA TTTTGATAGA ATTGAGATTA TCCCCGTTAA TCCCGCACGA 360

GAGGCGAAAG AGGATCTAGA AGCAGCAAAG AAAGCGGTGG CGAGCTTGTT TAC 413

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 137 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr Ser Pro Gly Gln 
1 5 10 15 

Ile Ser Thr Leu Arg Val Asn Ile Thr Ala Pro Leu Ser Gln Arg Tyr 
20 25 30 

Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr Asn Leu Gln Phe His Thr 
35 40 45 

Ser Ile Asp Gly Arg Pro Ile Asn Gln Gly Asn Phe Ser Ala Thr Met 
50 55 60 

Ser Ser Gly Ser Asn Leu Gln Ser Gly Ser Phe Arg Thr Val Gly Phe 
65 70 75 80 

Thr Thr Pro Phe Asn Phe Ser Asn Gly Ser Ser Val Phe Thr Leu Ser 
85 90 95 

Ala His Val Phe Asn Ser Gly Asn Glu Val Tyr Ile Asp Arg Ile Glu 
100 105 110 

Phe Val Pro Ala Glu Val Thr Phe Glu Ala Glu Tyr Asp Leu Glu Arg 
115 120 125 

Ala Gln Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 413 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

CCAGGWTTTA CAGGAGGAGA TATTCTTCGA AGAACTTCAC CTGGCCAGAT TTCAACCTTA 60

AGAGTAAATA TTACTGCACC ATTATCACAA AGATATCGGG TAAGAATTCG CTACGCTTCT 120

ACCACAAATT TACAATTCCA TACATCAATT GACGGAAGAC CTATTAATCA GGGGAATTTT 180

TCAGCAACTA TGAGTAGTGG GAGTAATTTA CAGTCCGGAA GCTTTAGGAC TGTAGGTTTT 240

ACTACTCCGT TTAACTTTTC AAATGGATCA AGTGTATTTA CGTTAAGTGC TCATGTCTTC 300

AATTCAGGCA ATGAAGTTTA TATAGATCGA ATTGAATTTG TTCCGGCAGA AGTAACCTTT 360

GAGGCAGAAT ATGATTTAGA AAGAGCACAR AAGGCGGTGG CGAGCTTGTT TAC 413



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 137 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr Gly Val Gly Thr
1 5 10 15

Phe Gly Thr Ile Arg Val Arg Thr Thr Ala Pro Leu Thr Gln Arg Tyr
20 25 30

Arg Ile Arg Phe Arg Phe Ala Ser Thr Thr Asn Leu Phe Ile Gly Ile
35 40 45

Arg Val Gly Asp Arg Gln Val Asn Tyr Phe Asp Phe Gly Arg Thr Met
50 55 60

Asn Arg Gly Asp Glu Leu Arg Tyr Glu Ser Phe Ala Thr Arg Glu Phe
65 70 75 80

Thr Thr Asp Phe Asn Phe Arg Gln Pro Gln Glu Leu Ile Ser Val Phe
85 90 95



WO 98/00546 



PCT/US97/11658 



Ala Asn Ala Phe Ser Ala Gly Gln Glu Val Tyr Phe Asp Arg Ile Glu
100 105 110

Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Lys Glu Asp Leu Glu Ala
115 120 125

Ala Lys Lys Ala Val Ala Ser Leu Phe
130 135



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 413 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

CCAGGTTTTA CAGGAGGGGA TATACTCCGA AGAACAGGGG TTGGTACATT TGGAACAATA 60

AGGGTAAGGA CTACTGCCCC CTTAACACAA AGATATCGCA TAAGATTCCG TTTCGCTTCT 120

ACCACAAATT TGTTCATTGG TATAAGAGTT GGTGATAGAC AAGTAAATTA TTTTGACTTC 180

GGAAGAACAA TGAACAGAGG AGATGAATTA AGGTACGAAT CTTTTGCTAC AAGGGAGTTT 240

ACTACTGATT TTAATTTTAG ACAACCTCAA GAATTAATCT CAGTGTTTGC AAATGCATTT 300 

AGCGCTGGTC AAGAAGTTTA TTTTGATAGA ATTGAGATTA TCCCCGTTAA TCCCGCACGA 360 

GAGGCGAAAG AGGATCTAGA AGCAGCAAAG AAAGCGGTGG CGAGCTTGTT TAC 413 



(2) INFORMATION FOR SEQ ID NO:41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 137 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr Asn Ala Gly Asn
1 5 10 15

Phe Gly Asp Met Arg Val Asn Ile Thr Ala Pro Leu Ser Gln Arg Tyr
20 25 30

Arg Val Arg Ile Arg Tyr Ala Ser Thr Ala Asn Leu Gln Phe His Thr
35 40 45

Ser Ile Asn Gly Arg Ala Ile Asn Gln Ala Asn Phe Pro Ala Thr Met
50 55 60

Asn Ser Gly Glu Asn Leu Gln Ser Gly Ser Phe Arg Val Ala Gly Phe
65 70 75 80

Thr Thr Pro Phe Thr Phe Ser Asp Ala Leu Ser Thr Phe Thr Ile Gly
85 90 95

Ala Phe Ser Phe Ser Ser Asn Asn Glu Val Tyr Ile Asp Arg Ile Glu
100 105 110

Phe Val Pro Ala Glu Val Thr Phe Ala Thr Glu Ser Asp Gln Asp Arg
115 120 125

Ala Gln Lys Ala Val Ala Ser Leu Phe
130 135 



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 413 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

CCAGGWTTTA CAGGAGGGGA TATCCTTCGA AGAACGAATG CTGGTAACTT TGGAGATATG 60

CGTGTAAACA TTACTGCACC ACTATCACAA AGATATCGCG TAAGGATTCG TTATGCTTCT 120

ACTGCAAATT TACAATTCCA TACATCAATT AACGGAAGAG CCATTAATCA GGCGAATTTC 180

CCAGCAACTA TGAACAGTGG GGAGAATTTA CAGTCCGGAA GCTTCAGGGT TGCAGGTTTT 240

ACTACTCCAT TTACCTTTTC AGATGCACTA AGCACATTCA CAATAGGTGC TTTTAGCTTC 300

TCTTCAAACA ACGAAGTTTA TATAGATCGA ATTGAATTTG TTCCGGCAGA AGTAACATTT 360 

GCAACAGAAT CTGATCAGGA TAGAGCACAA AAGGCGGTGG CGAGCTTGTT TAC 413 



(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 136 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

Pro Gly Phe Ile Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser
1 5 10 15

Leu Gly Val Leu Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr
20 25 30

Arg Ile Arg Val Arg Tyr Ala Ser Thr Thr Asn Ile Arg Leu Ser Val
35 40 45

Asn Gly Ser Phe Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg
50 55 60

Leu Gly Glu Asp Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn
65 70 75 80

Thr Ser Ile Arg Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile
85 90 95

Glu Pro Ser Phe Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe
100 105 110

Ile Pro Val Asn Pro Thr Arg Glu Ala Lys Glu Asp Leu Xaa Ala Ala
115 120 125

Lys Lys Ala Val Ala Ser Leu Phe
130 135 



(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 410 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:

CCAGGATTTA TAGGAGGAGC TCTACTTCAA AGGACTGACC ATGGTTCGCT TGGAGTATTG 60

AGGGTCCAAT TTCCACTTCA CTTAAGACAA CAATATCGTA TTAGAGTCCG TTATGCTTCT 120

ACAACAAATA TTCGATTGAG TGTGAATGGC AGTTTCGGTA CTATTTCTCA AAATCTCCCT 180

AGTACAATGA GATTAGGAGA GGATTTAAGA TACGGATCTT TTGCTATAAG AGAGTTTAAT 240

ACTTCTATTA GACCCACTGC AAGTCCGGAC CAAATTCGAT TGACAATAGA ACCATCTTTT 300

ATTAGACAAG AGGTCTATGT AGATAGAATT GAGTTCATTC CAGTTAATCC GACGCGAGAG 360

GCGAAAGAGG ATCTAKAAGC AGCAAAAAAA GCGGTGGCGA GCTTGTTTAC 410 



(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 137 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

Gln Xaa Leu Ser Gly Gly Asp Val Ile Arg Arg Thr Asn Thr Gly Gly
1 5 10 15

Phe Gly Ala Ile Arg Val Ser Val Thr Gly Pro Leu Thr Gln Arg Tyr
20 25 30

Arg Ile Arg Phe Arg Tyr Ala Ser Thr Ile Asp Phe Asp Phe Phe Val
35 40 45

Thr Arg Gly Gly Thr Thr Ile Asn Asn Phe Arg Phe Thr Arg Thr Met
50 55 60

Asn Arg Gly Gln Glu Ser Arg Tyr Glu Ser Tyr Arg Thr Val Glu Phe
65 70 75 80

Thr Thr Pro Phe Asn Phe Thr Gln Ser Gln Asp Ile Ile Arg Thr Ser
85 90 95

Ile Gln Gly Leu Ser Gly Asn Gly Glu Val Tyr Leu Asp Arg Ile Glu
100 105 110

Ile Ile Pro Val Asn Pro Thr Arg Glu Ala Glu Glu Asp Leu Glu Ala
115 120 125

Ala Lys Lys Ala Val Ala Ser Leu Phe
130 135 



(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 414 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:

CCAGGWTTTA TCAGGAGGAG ATGTAATCCG AAGAACAAAT ACTGGTGGAT TCGGAGCAAT 60

AAGGGTGTCG GTCACTGGAC CGCTAACACA ACGATATCGC ATAAGGTTCC GTTATGCTTC 120

GACAATAGAT TTTGATTTCT TTGTAACACG TGGAGGAACT ACTATAAATA ATTTTAGATT 180

TACACGTACA ATGAACAGGG GACAGGAATC AAGATATGAA TCCTATCGTA CTGTAGAGTT 240

TACAACTCCT TTTAACTTTA CACAAAGTCA AGATATAATT CGAACATCTA TCCAGGGACT 300

TAGTGGAAAT GGGGAAGTAT ACCTTGATAG AATTGAAATC ATCCCTGTAA ATCCAACACG 360

AGAAGCGGAA GARGATTTAG AAGCGGCGAA GAAAGCGGTG GCGAGCTTGT TTAC 414

(2) INFORMATION FOR SEQ ID NO:47:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 142 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:

Pro Gly Phe Thr Gly Gly Gly Ile Leu Arg Arg Thr Thr Asn Gly Thr
1 5 10 15

Phe Gly Thr Leu Arg Val Thr Val Asn Ser Pro Leu Thr Gln Arg Tyr
20 25 30

Arg Val Arg Val Arg Phe Ala Ser Ser Gly Asn Phe Ser Ile Arg Ile
35 40 45

Leu Arg Gly Asn Thr Ser Ile Ala Tyr Gln Arg Phe Gly Ser Thr Met
50 55 60

Asn Arg Gly Gln Glu Leu Thr Tyr Glu Ser Phe Val Thr Ser Glu Phe
65 70 75 80

Thr Thr Asn Gln Ser Asp Leu Pro Phe Thr Phe Thr Gln Ala Gln Glu
85 90 95

Asn Leu Thr Ile Leu Ala Glu Gly Val Ser Thr Gly Ser Glu Tyr Phe
100 105 110

Ile Asp Arg Ile Glu Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Glu
115 120 125 

Glu Asp Leu Glu Ala Ala Lys Lys Ala Val Ala Ser Leu Phe 
130 135 140

(2) INFORMATION FOR SEQ ID NO:48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 428 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

CCAGGWTTTA CAGGAGGGGG TATACTCCGA AGAACAACTA ATGGCACATT TGGAACGTTA 60

AGAGTAACAG TTAATTCACC ATTAACACAA AGATATCGCG TAAGAGTTCG TTTTGCTTCA 120

TCAGGAAATT TCAGCATAAG GATACTGCGT GGAAATACCT CTATAGCTTA TCAAAGATTT 180

GGGAGTACAA TGAACAGAGG ACAGGAACTA ACTTACGAAT CATTTGTCAC AAGTGAGTTC 240

ACTACTAATC AGAGCGATCT GCCTTTTACA TTTACACAAG CTCAAGAAAA TTTAACAATC 300

CTTGCAGAAG GTGTTAGCAC CGGTAGTGAA TATTTTATAG ATAGAATTGA AATCATCCCT 360

GTGAACCCGG CACGAGAAGC AGAAGAGGAT TTAGAAGCAG CGAAGAAAGC GGTGGCGAGC 420

TTGTTTAC 428



(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 136 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

Pro Gly Phe Ile Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser
1 5 10 15

Leu Gly Val Leu Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr
20 25 30

Arg Ile Arg Val Arg Tyr Ala Ser Thr Thr Asn Ile Arg Leu Ser Val
35 40 45

Asn Gly Ser Phe Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg
50 55 60

Leu Gly Glu Asp Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn
65 70 75 80

Thr Ser Ile Arg Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile
85 90 95

Glu Pro Ser Phe Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe
100 105 110

Ile Pro Val Asn Pro Thr Arg Glu Ala Lys Glu Asp Leu Glu Ala Ala
115 120 125

Lys Lys Ala Val Ala Ser Leu Phe
130 135



(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 410 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:50:

CCAGGWTTTA TAGGAGGAGC TCTACTTCAA AGGACTGACC ATGGTTCGCT TGGAGTATTG 60

AGGGTCCAAT TTCCACTTCA CTTAAGACAA CAATATCGTA TTAGAGTCCG TTATGCTTCT 120

ACAACAAATA TTCGATTGAG TGTGAATGGC AGTTTCGGTA CTATTTCTCA AAATCTCCCT 180

AGTACAATGA GATTAGGAGA GGATTTAAGA TACGGATCTT TTGCTATAAG AGAGTTTAAT 240

ACTTCTATTA GACCCACTGC AAGTCCGGAC CAAATTCGAT TGACAATAGA ACCATCTTTT 300

ATTAGACAAG AGGTCTATGT AGATAGAATT GAGTTCATTC CAGTTAATCC GACGCGAGAG 360

GCGAAAGAGG ATCTAGAAGC AGCAAAAAAA GCGGTGGCGA GCTTGTTTAC 410 



(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 137 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr Gly Val Gly Thr
1 5 10 15

Phe Gly Thr Ile Arg Val Arg Thr Thr Ala Pro Leu Thr Gln Arg Tyr
20 25 30

Arg Ile Arg Phe Arg Phe Ala Ser Thr Thr Asn Leu Phe Ile Gly Ile
35 40 45

Arg Val Gly Asp Arg Gln Val Asn Tyr Phe Asp Phe Gly Arg Thr Met
50 55 60

Asn Arg Gly Asp Glu Leu Arg Tyr Glu Ser Phe Ala Thr Arg Glu Phe
65 70 75 80

Thr Thr Asp Phe Asn Phe Arg Gln Pro Gln Glu Leu Ile Ser Val Phe
85 90 95

Ala Asn Ala Phe Ser Ala Gly Gln Glu Val Tyr Phe Asp Arg Ile Glu
100 105 110

Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Lys Glu Asp Leu Glu Ala
115 120 125

Ala Lys Lys Ala Val Ala Ser Leu Phe
130 135



(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 412 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

CCAGGTTTTA CAGGAGGGGA TATACTCCGA AGAACAGGGG TTGGTACATT TGGAACAATA 60 

AGGGTAAGGA CTACTGCCCC CTTAACACAA AGATATCGCA TAAGATTCCG TTTCGCTTCT 120 

ACCACAAATT TGTTCATTGG TATAAGAGTT GGTGATAGAC AAGTAAATTA TTTTGACTTC 180 

GGAAGAACAA TGAACAGAGG AGATGAATTA AGGTACGAAT CTTTTGCTAC AAGGGAGTTT 240

ACTACTGATT TTAATTTTAG ACAACCTCAA GAATTAATCT CAGTGTTTGC AAATGCATTT 300

AGCGCTGGTC AAGAAGTTTA TTTTGATAGA ATTGAGATTA TCCCCGTTAA TCCCGCACGA 360

GAGGCGAAAG AGGATCTAGA AGCAGCAAAG AAAGCGGTGG CGAGCTTGTT TA 412

(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 137 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

Pro Gly Phe Thr Gly Gly Asp Val Ile Arg Arg Thr Asn Thr Gly Gly
1 5 10 15

Phe Gly Ala Ile Arg Val Ser Val Thr Gly Pro Leu Thr Gln Arg Tyr
20 25 30

Arg Ile Arg Phe Arg Tyr Ala Ser Thr Ile Asp Phe Asp Phe Phe Val
35 40 45

Thr Arg Gly Gly Thr Thr Ile Asn Asn Phe Arg Phe Thr Arg Thr Met
50 55 60

Asn Arg Gly Gln Glu Ser Arg Tyr Glu Ser Tyr Arg Thr Val Glu Phe
65 70 75 80

Thr Thr Pro Phe Asn Phe Thr Gln Ser Gln Asp Ile Ile Arg Thr Ser
85 90 95

Ile Gln Gly Leu Ser Gly Asn Gly Glu Val Tyr Leu Asp Arg Ile Glu
100 105 110

Ile Ile Pro Val Asn Pro Thr Arg Glu Ala Glu Glu Asp Xaa Glu Ala
115 120 125

Ala Lys Lys Ala Val Ala Ser Leu Phe
130 135 



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 413 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

CCAGGATTTA CAGGAGGAGA TGTAATCCGA AGAACAAATA CTGGTGGATT CGGAGCAATA 60

AGGGTGTCGG TCACTGGACC GCTAACACAA CGATATCGCA TAAGGTTCCG TTATGCTTCG 120

ACAATAGATT TTGATTTCTT TGTAACACGT GGAGGAACTA CTATAAATAA TTTTAGATTT 180

ACACGTACAA TGAACAGGGG ACAGGAATCA AGATATGAAT CCTATCGTAC TGTAGAGTTT 240

ACAACTCCTT TTAACTTTAC ACAAAGTCAA GATATAATTC GAACATCTAT CCAGGGACTT 300

AGTGGAAATG GGGAAGTATA CCTTGATAGA ATTGAAATCA TCCCTGTAAA TCCAACACGA 360

GAAGCGGAAG AGGATTTWGA AGCGGCGAAG AAAGCGGTGG CGAGCTTGTT TAC 413

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 136 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55:

Pro Gly Phe Ile Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser
1 5 10 15

Leu Gly Val Leu Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr
20 25 30

Arg Ile Arg Val Arg Tyr Ala Ser Thr Thr Asn Ile Arg Leu Ser Val
35 40 45

Asn Gly Ser Phe Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg
50 55 60

Leu Gly Glu Asp Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn
65 70 75 80

Thr Ser Ile Arg Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile
85 90 95

Glu Pro Ser Phe Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe
100 105 110

Ile Pro Val Asn Pro Thr Arg Glu Ala Lys Xaa Asp Leu Xaa Ala Ala
115 120 125 

Lys Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 410 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

CCAGGATTTA TAGGAGGAGC TCTACTTCAA AGGACTGACC ATGGTTCGCT TGGAGTATTG 60

AGGGTCCAAT TTCCACTTCA CTTAAGACAA CAATATCGTA TTAGAGTCCG TTATGCTTCT 120

ACAACAAATA TTCGATTGAG TGTGAATGGC AGTTTCGGTA CTATTTCTCA AAATCTCCCT 180

AGTACAATGA GATTAGGAGA GGATTTAAGA TACGGATCTT TTGCTATAAG AGAGTTTAAT 240

ACTTCTATTA GACCCACTGC AAGTCCGGAC CAAATTCGAT TGACAATAGA ACCATCTTTT 300 

ATTAGACAAG AGGTCTATGT AGATAGAATT GAGTTCATTC CAGTTAATCC GACGCGAGAG 360 

GCGAAAGAKG ATCTABAAGC AGCAAAAAAA GCGGTGGCGA GCTTGTTTAC 410 

(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 137 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

Pro Gly Phe Thr Gly Gly Asp Val Ile Arg Arg Thr Asn Thr Gly Gly
1 5 10 15

Phe Gly Ala Ile Arg Val Ser Val Thr Gly Pro Leu Thr Gln Arg Tyr
20 25 30

Arg Ile Arg Phe Arg Tyr Ala Ser Thr Ile Asp Phe Asp Phe Phe Val
35 40 45

Thr Arg Gly Gly Thr Thr Ile Asn Asn Phe Arg Phe Thr Arg Thr Met
50 55 60

Asn Arg Gly Gln Glu Ser Arg Tyr Glu Ser Tyr Arg Thr Val Glu Phe
65 70 75 80

Thr Thr Pro Phe Asn Phe Thr Gln Ser Gln Asp Ile Ile Arg Thr Ser
85 90 95

Ile Gln Gly Leu Ser Gly Asn Gly Glu Val Tyr Leu Asp Arg Ile Glu
100 105 110

Ile Ile Pro Val Asn Pro Thr Arg Glu Ala Glu Glu Asp Leu Glu Ala
115 120 125 

Ala Lys Lys Ala Val Ala Ser Leu Phe 
130 135 



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 413 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

CCAGGWTTTA CAGGAGGAGA TGTAATCCGA AGAACAAATA CTGGTGGATT CGGAGCAATA 60

AGGGTGTCGG TCACTGGACC GCTAACACAA CGATATCGCA TAAGGTTCCG TTATGCTTCG 120

ACAATAGATT TTGATTTCTT TGTAACACGT GGAGGAACTA CTATAAATAA TTTTAGATTT 180

ACACGTACAA TGAACAGGGG ACAGGAATCA AGATATGAAT CCTATCGTAC TGTAGAGTTT 240

ACAACTCCTT TTAACTTTAC ACAAAGTCAA GATATAATTC GAACATCTAT CCAGGGACTT 300

AGTGGAAATG GGGAAGTATA CCTTGATAGA ATTGAAATCA TCCCTGTAAA TCCAACACGA 360

GAAGCGGAAG AGGATTTAGA AGCGGCGAAG AAAGCGGTGG CGAGCTTGTT TAC 413 



(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 142 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

Pro Gly Phe Xaa Gly Gly Gly Ile Leu Arg Arg Thr Thr Asn Gly Thr
1 5 10 15

Phe Gly Thr Leu Arg Val Thr Val Asn Ser Pro Leu Thr Gln Arg Tyr
20 25 30

Arg Val Arg Val Arg Phe Ala Ser Ser Gly Asn Phe Ser Ile Arg Ile
35 40 45

Leu Arg Gly Asn Thr Ser Ile Ala Tyr Gln Arg Phe Gly Ser Thr Met
50 55 60

Asn Arg Gly Gln Glu Leu Thr Tyr Glu Ser Phe Val Thr Ser Glu Phe
65 70 75 80

Thr Thr Asn Gln Ser Asp Leu Pro Phe Thr Phe Thr Gln Ala Gln Glu
85 90 95

Asn Leu Thr Ile Leu Ala Glu Gly Val Ser Thr Gly Ser Glu Tyr Phe
100 105 110

Ile Asp Arg Ile Glu Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Glu
115 120 125 

Glu Asp Leu Glu Ala Ala Lys Lys Ala Val Ala Ser Leu Phe 
130 135 140 



(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 428 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

CCAGGWTTTA YAGGAGGGGG TATACTCCGA AGAACAACTA ATGGCACATT TGGAACGTTA 60 

AGAGTAACAG TTAATTCACC ATTAACACAA AGATATCGCG TAAGAGTTCG TTTTGCTTCA 120

TCAGGAAATT TCAGCATAAG GATACTGCGT GGAAATACCT CTATAGCTTA TCAAAGATTT 180

GGGAGTACAA TGAACAGAGG ACAGGAACTA ACTTACGAAT CATTTGTCAC AAGTGAGTTC 240

ACTACTAATC AGAGCGATCT GCCTTTTACA TTTACACAAG CTCAAGAAAA TTTAACAATC 300

CTTGCAGAAG GTGTTAGCAC CGGTAGTGAA TATTTTATAG ATAGAATTGA AATCATCCCT 360

GTGAACCCGG CACGAGAAGC AGAAGAGGAT TTAGAAGCAG CGAAGAAAGC GGTGGCGAGC 420 

TTGTTTAC 428 



(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 136 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:61:

Pro Gly Phe Ile Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser
1 5 10 15

Leu Gly Val Leu Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr
20 25 30

Arg Ile Arg Val Arg Tyr Ala Ser Thr Thr Asn Ile Arg Leu Ser Val
35 40 45

Asn Gly Ser Phe Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg
50 55 60

Leu Gly Glu Asp Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn
65 70 75 80

Thr Ser Ile Arg Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile
85 90 95

Glu Pro Ser Phe Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe
100 105 110

Ile Pro Val Asn Pro Thr Arg Glu Ala Lys Glu Asp Leu Glu Ala Ala
115 120 125 

Lys Lys Ala Val Ala Ser Leu Phe 
130 135 

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 410 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 

CCAGGTTTTA TAGGAGGAGC TCTACTTCAA AGGACTGACC ATGGTTCGCT TGGAGTATTG 60 

AGGGTCCAAT TTCCACTTCA CTTAAGACAA CAATATCGTA TTAGAGTCCG TTATGCTTCT 120 

ACAACAAATA TTCGATTGAG TGTGAATGGC AGTTTCGGTA CTATTTCTCA AAATCTCCCT 180

AGTACAATGA GATTAGGAGA GGATTTAAGA TACGGATCTT TTGCTATAAG AGAGTTTAAT 240

ACTTCTATTA GACCCACTGC AAGTCCGGAC CAAATTCGAT TGACAATAGA ACCATCTTTT 300

ATTAGACAAG AGGTCTATGT AGATAGAATT GAGTTCATTC CAGTTAATCC GACGCGAGAG 360




GCGAAAGAGG ATCTAGAAGC AGCAAAAAAA GCGGTGGCGA GCTTGTTTAC 410 



(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

GTTCATTGGT ATAAGAGTTG GTG 23



(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
CCACTGCAAG TCCGGACCAA ATTCG 25 



(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 
GAATATATTC CCGTCYATCT CTGG 24 



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:
GCACGAATTA CTGTAGCGAT AGG 



(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 
GCTGGTAACT TTGGAGATAT GCGTG 



(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:
GATTTCTTTG TAACACGTGG AGG 



(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:
CACTACTAAT CAGAGCGATC TG 



(2) INFORMATION FOR SEQ ID NO: 70: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1156 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

Met Asn Gln Asn Lys His Gly Ile Ile Gly Ala Ser Asn Cys Gly Cys
1 5 10 15

Ala Ser Asp Asp Val Ala Lys Tyr Pro Leu Ala Asn Asn Pro Tyr Ser
20 25 30

Ser Ala Leu Asn Leu Asn Ser Cys Gln Asn Ser Ser Ile Leu Asn Trp
35 40 45

Ile Asn Ile Ile Gly Asp Ala Ala Lys Glu Ala Val Ser Ile Gly Thr
50 55 60

Thr Ile Val Ser Leu Ile Thr Ala Pro Ser Leu Thr Gly Leu Ile Ser
65 70 75 80

Ile Val Tyr Asp Leu Ile Gly Lys Val Leu Gly Gly Ser Ser Gly Gln
85 90 95

Ser Ile Ser Asp Leu Ser Ile Cys Asp Leu Leu Ser Ile Ile Asp Leu
100 105 110

Arg Val Ser Gln Ser Val Leu Asn Asp Gly Ile Ala Asp Phe Asn Gly
115 120 125

Ser Val Leu Leu Tyr Arg Asn Tyr Leu Glu Ala Leu Asp Ser Trp Asn
130 135 140



Lys Asn Pro Asn Ser Ala Ser Ala Glu Glu Leu Arg Thr Arg Phe Arg
145 150 155 160

Ile Ala Asp Ser Glu Phe Asp Arg Ile Leu Thr Arg Gly Ser Leu Thr
165 170 175

Asn Gly Gly Ser Leu Ala Arg Gln Asn Ala Gln Ile Leu Leu Leu Pro
180 185 190

Ser Phe Ala Ser Ala Ala Phe Phe His Leu Leu Leu Leu Arg Asp Ala
195 200 205

Thr Arg Tyr Gly Thr Asn Trp Gly Leu Tyr Asn Ala Thr Pro Phe Ile
210 215 220

Asn Tyr Gln Ser Lys Leu Val Glu Leu Ile Glu Leu Tyr Thr Asp Tyr
225 230 235 240

Cys Val His Trp Tyr Asn Arg Gly Phe Asn Glu Leu Arg Gln Arg Gly
245 250 255

Thr Ser Ala Thr Ala Trp Leu Glu Phe His Arg Tyr Arg Arg Glu Met
260 265 270

Thr Leu Met Val Leu Asp Ile Val Ala Ser Phe Ser Ser Leu Asp Ile
275 280 285

Thr Asn Tyr Pro Ile Glu Thr Asp Phe Gln Leu Ser Arg Val Ile Tyr
290 295 300

Thr Asp Pro Ile Gly Phe Val His Arg Ser Ser Leu Arg Gly Glu Ser
305 310 315 320

Trp Phe Ser Phe Val Asn Arg Ala Asn Phe Ser Asp Leu Glu Asn Ala 
325 330 335 

Ile Pro Asn Pro Arg Pro Ser Trp Phe Leu Asn Asn Met Ile Ile Ser
340 345 350

Thr Gly Ser Leu Thr Leu Pro Val Ser Pro Ser Thr Asp Arg Ala Arg
355 360 365

Val Trp Tyr Gly Ser Arg Asp Arg Ile Ser Pro Ala Asn Ser Gln Phe
370 375 380

Ile Thr Glu Leu Ile Ser Gly Gln His Thr Thr Ala Thr Gln Thr Ile
385 390 395 400

Leu Gly Arg Asn Ile Phe Arg Val Asp Ser Gln Ala Cys Asn Leu Asn
405 410 415

Asp Thr Thr Tyr Gly Val Asn Arg Ala Val Phe Tyr His Asp Ala Ser
420 425 430

Glu Gly Ser Gln Arg Ser Val Tyr Glu Gly Tyr Ile Arg Thr Thr Gly
435 440 445

Ile Asp Asn Pro Arg Val Gln Asn Ile Asn Thr Tyr Leu Pro Gly Glu
450 455 460

Asn Ser Asp Ile Pro Thr Pro Glu Asp Tyr Thr His Ile Leu Ser Thr
465 470 475 480

Thr Ile Asn Leu Thr Gly Gly Leu Arg Gln Val Ala Ser Asn Arg Arg
485 490 495

Ser Ser Leu Val Met Tyr Gly Trp Thr His Lys Ser Leu Ala Arg Asn
500 505 510

Asn Thr Ile Asn Pro Asp Arg Ile Thr Gln Ile Pro Leu Thr Lys Val
515 520 525

Asp Thr Arg Gly Thr Gly Val Ser Tyr Val Asn Asp Pro Gly Phe Ile
530 535 540

Gly Gly Ala Leu Leu Gln Arg Thr Asp His Gly Ser Leu Gly Val Leu
545 550 555 560

Arg Val Gln Phe Pro Leu His Leu Arg Gln Gln Tyr Arg Ile Arg Val
565 570 575

Arg Tyr Ala Ser Thr Thr Asn Ile Arg Leu Ser Val Asn Gly Ser Phe
580 585 590

Gly Thr Ile Ser Gln Asn Leu Pro Ser Thr Met Arg Leu Gly Glu Asp
595 600 605

Leu Arg Tyr Gly Ser Phe Ala Ile Arg Glu Phe Asn Thr Ser Ile Arg
610 615 620

Pro Thr Ala Ser Pro Asp Gln Ile Arg Leu Thr Ile Glu Pro Ser Phe
625 630 635 640

Ile Arg Gln Glu Val Tyr Val Asp Arg Ile Glu Phe Ile Pro Val Asn
645 650 655

Pro Thr Arg Glu Ala Lys Glu Asp Leu Glu Ala Ala Lys Lys Ala Val 
660 665 670 

Ala Ser Leu Phe Thr Arg Thr Arg Asp Gly Leu Gln Val Asn Val Lys
675 680 685

Asp Tyr Gln Val Asp Gln Ala Ala Asn Leu Val Ser Cys Leu Ser Asp
690 695 700

Glu Gln Tyr Gly Tyr Asp Lys Lys Met Leu Leu Glu Ala Val Arg Ala
705 710 715 720

Ala Lys Arg Leu Ser Arg Glu Arg Asn Leu Leu Gln Asp Pro Asp Phe
725 730 735

Asn Thr Ile Asn Ser Thr Glu Glu Asn Gly Trp Lys Ala Ser Asn Gly
740 745 750

Val Thr Ile Ser Glu Gly Gly Pro Phe Tyr Lys Gly Arg Ala Ile Gln
755 760 765

Leu Ala Ser Ala Arg Glu Asn Tyr Pro Thr Tyr Ile Tyr Gln Lys Val
770 775 780

Asp Ala Ser Glu Leu Lys Pro Tyr Thr Arg Tyr Arg Leu Asp Gly Phe
785 790 795 800

Val Lys Ser Ser Gln Asp Leu Glu Ile Asp Leu Ile His His His Lys
805 810 815

Val His Leu Val Lys Asn Val Pro Asp Asn Leu Val Ser Asp Thr Tyr
820 825 830

Pro Asp Asp Ser Cys Ser Gly Ile Asn Arg Cys Gln Glu Gln Gln Met
835 840 845

Val Asn Ala Gln Leu Glu Thr Glu His His His Pro Met Asp Cys Cys
850 855 860

Glu Ala Ala Gln Thr His Glu Phe Ser Ser Tyr Ile Asp Thr Gly Asp
865 870 875 880

Leu Asn Ser Ser Val Asp Gln Gly Ile Trp Ala Ile Phe Lys Val Arg
885 890 895

Thr Thr Asp Gly Tyr Ala Thr Leu Gly Asn Leu Glu Leu Val Glu Val
900 905 910

Gly Pro Leu Ser Gly Glu Ser Leu Glu Arg Glu Gln Arg Asp Asn Thr
915 920 925

Lys Trp Ser Ala Glu Leu Gly Arg Lys Arg Ala Glu Thr Asp Arg Val
930 935 940

Tyr Gln Asp Ala Lys Gln Ser Ile Asn His Leu Phe Val Asp Tyr Gln
945 950 955 960

Asp Gln Gln Leu Asn Pro Glu Ile Gly Met Ala Asp Ile Met Asp Ala
965 970 975

Gln Asn Leu Val Ala Ser Ile Ser Asp Val Tyr Ser Asp Ala Val Leu
980 985 990

Gln Ile Pro Gly Ile Asn Tyr Glu Ile Tyr Thr Glu Leu Ser Asn Arg
995 1000 1005

Leu Gln Gln Ala Ser Tyr Leu Tyr Thr Ser Arg Asn Ala Val Gln Asn
1010 1015 1020

Gly Asp Phe Asn Asn Gly Leu Asp Ser Trp Asn Ala Thr Ala Gly Ala
1025 1030 1035 1040

Ser Val Gln Gln Asp Gly Asn Thr His Phe Leu Val Leu Ser His Trp
1045 1050 1055

Asp Ala Gln Val Ser Gln Gln Phe Arg Val Gln Pro Asn Cys Lys Tyr
1060 1065 1070

Val Leu Arg Val Thr Ala Glu Lys Val Gly Gly Gly Asp Gly Tyr Val
1075 1080 1085

Thr Ile Arg Asp Asp Ala His His Thr Glu Thr Leu Thr Phe Asn Ala
1090 1095 1100

Cys Asp Tyr Asp Ile Asn Gly Thr Tyr Val Thr Asp Asn Thr Tyr Leu
1105 1110 1115 1120

Thr Lys Glu Val Val Phe His Pro Glu Thr Gln His Met Trp Val Glu
1125 1130 1135

Val Asn Glu Thr Glu Gly Ala Phe His Ile Asp Ser Ile Glu Phe Val
1140 1145 1150 

Glu Thr Glu Lys 
1155 

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3471 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

ATGAATCAAA ATAAACACGG AATTATTGGC GCTTCCAATT GTGGTTGTGC ATCTGATGAT 60 

GTTGCGAAAT ATCCTTTAGC CAACAATCCA TATTCATCTG CTTTAAATTT AAATTCTTGT 120

CAAAATAGTA GTATTCTCAA CTGGATTAAC ATAATAGGCG ATGCAGCAAA AGAAGCAGTA 180

TCTATTGGGA CAACCATAGT CTCTCTTATC ACAGCACCTT CTCTTACTGG ATTAATTTCA 240

ATAGTATATG ACCTTATAGG TAAAGTACTA GGAGGTAGTA GTGGACAATC CATATCAGAT 300

TTGTCTATAT GTGACTTATT ATCTATTATT GATTTACGGG TAAGTCAGAG TGTTTTAAAT 360

GATGGGATTG CAGATTTTAA TGGTTCTGTA CTCTTATACA GGAACTATTT AGAGGCTCTG 420

GATAGCTGGA ATAAGAATCC TAATTCTGCT TCTGCTGAAG AACTCCGTAC TCGTTTTAGA 480

ATCGCCGACT CAGAATTTGA TAGAATTTTA ACCCGAGGGT CTTTAACGAA TGGTGGCTCG 540

TTAGCTAGAC AAAATGCCCA AATATTATTA TTACCTTCTT TTGCGAGCGC TGCATTTTTC 600

CATTTATTAC TACTAAGGGA TGCTACTAGA TATGGCACTA ATTGGGGGCT ATACAATGCT 660

ACACCTTTTA TAAATTATCA ATCAAAACTA GTAGAGCTTA TTGAACTATA TACTGATTAT 720

TGCGTACATT GGTATAATCG AGGTTTCAAC GAACTAAGAC AACGAGGCAC TAGTGCTACA 780

GCTTGGTTAG AATTTCATAG ATATCGTAGA GAGATGACAT TGATGGTATT AGATATAGTA 840

GCATCATTTT CAAGTCTTGA TATTACTAAT TACCCAATAG AAACAGATTT TCAGTTGAGT 900

AGGGTCATTT ATACAGATCC AATTGGTTTT GTACATCGTA GTAGTCTTAG GGGAGAAAGT 960

TGGTTTAGCT TTGTTAATAG AGCTAATTTC TCAGATTTAG AAAATGCAAT ACCTAATCCT 1020

AGACCGTCTT GGTTTTTAAA TAATATGATT ATATCTACTG GTTCACTTAC ATTGCCGGTT 1080

AGCCCAAGTA CTGATAGAGC GAGGGTATGG TATGGAAGTC GAGATCGAAT TTCCCCTGCT 1140

AATTCACAAT TTATTACTGA ACTAATCTCT GGACAACATA CGACTGCTAC ACAAACTATT 1200

TTAGGGCGAA ATATATTTAG AGTAGATTCT CAAGCTTGTA ATTTAAATGA TACCACATAT 1260

GGAGTGAATA GGGCGGTATT TTATCATGAT GCGAGTGAAG GTTCTCAAAG ATCCGTGTAC 1320

GAGGGGTATA TTCGAACAAC TGGGATAGAT AACCCTAGAG TTCAAAATAT TAACACTTAT 1380

TTACCTGGAG AAAATTCAGA TATCCCAACT CCAGAAGACT ATACTCATAT ATTAAGCACA 1440

ACAATAAATT TAACAGGAGG ACTTAGACAA GTAGCATCTA ATCGCCGTTC ATCTTTAGTA 1500

ATGTATGGTT GGACACATAA AAGTCTGGCT CGTAACAATA CCATTAATCC AGATAGAATT 1560

ACACAGATAC CATTGACGAA GGTTGATACC CGAGGCACAG GTGTTTCTTA TGTGAATGAT 1620

CCAGGATTTA TAGGAGGAGC TCTACTTCAA AGGACTGACC ATGGTTCGCT TGGAGTATTG 1680 

AGGGTCCAAT TTCCACTTCA CTTAAGACAA CAATATCGTA TTAGAGTCCG TTATGCTTCT 174 0 

ACAACAAATA TTCGATTGAG TGTGAATGGC AGTTTCGGTA CTATTTCTCA AAATCTCCCT 1800 

AGTACAATGA GATTAGGAGA GGATTTAAGA TACGGATCTT TTGCTATAAG AGAGTTTAAT 1860 

ACTTCTATTA GACCCACTGC AAGTCCGGAC CAAATTCGAT TGACAATAGA ACCATCTTTT 192 0 

ATTAGACAAG AGGTCTATGT AGATAGAATT GAGTTCATTC CAGTTAATCC GACGCGAGAG 1980 

GCGAAAGAGG ATCTAGAAGC AGCAAAAAAA GCGGTGGCGA GCTTGTTTAC ACGCACAAGG 2 04 0 

GACGGATTAC AAGTAAATGT GAAAGATTAT CAAGTCGATC AAGCGGCAAA TTTAGTGTCA 2100 

TGCTTATCAG ATGAACAATA TGGGTATGAC AAAAAGATGT TATTGGAAGC GGTACGTGCG 2160 

GCAAAACGAC TTAGCCGAGA ACGCAACTTA CTTCAGGATC CAGATTTTAA TACAATCAAT 2220 

AGTACAGAAG AAAATGGATG GAAAGCAAGT AACGGCGTTA CTATTAGTGA GGGCGGGCCA 2 280 

TTCTATAAAG GCCGTGCAAT TCAGCTAGCA AGTGCACGAG AAAATTACCC AACATACATC 234 0 

TATCAAAAAG TAGATGCATC GGAGTTAAAG CCGTATACAC GTTATAGACT GGATGGGTTC 2 4 00 

GTGAAGAGTA GTCAAGATTT AGAAATTGAT CTCATTCACC ATCATAAAGT CCATCTTGTG 2460 

AAAAATGTAC CAGATAATTT AGTATCTGAT ACTTACCCAG ATGATTCTTG TAGTGGAATC 2520

AATCGATGTC AGGAACAACA GATGGTAAAT GCGCAACTGG AAACAGAGCA TCATCATCCG 2 580 

ATGGATTGCT GTGAAGCAGC TCAAACACAT GAGTTTTCTT CCTATATTGA TACAGGGGAT 2 64 0 
TTAAATTCGA GTGTAGACCA GGGAATCTGG GCGATCTTTA AAGTTCGAAC AACCGATGGT 2700

TATGCGACGT TAGGAAATCT TGAATTGGTA GAGGTCGGAC CGTTATCGGG TGAATCTTTA 27 60 

GAACGTGAAC AAAGGGATAA TACAAAATGG AGTGCAGAGC TAGGAAGAAA GCGTGCAGAA 2 82 0 

ACAGATCGCG TGTATCAAGA TGCCAAACAA TCCATCAATC ATTTATTTGT GGATTATCAA 2 880 

GATCAACAAT TAAATCCAGA AATAGGGATG GCAGATATTA TGGACGCTCA AAATCTTGTC 2940

GCATCAATTT CAGATGTATA TAGCGATGCC GTACTGCAAA TCCCTGGAAT TAACTATGAG 3000

ATTTACACAG AGCTGTCCAA TCGCTTACAA CAAGCATCGT ATCTGTATAC GTCTCGAAAT 3 06 0 

GCGGTGCAAA ATGGGGACTT TAACAACGGG CTAGATAGCT GGAATGCAAC AGCGGGTGCA 312 0 

TCGGTACAAC AGGATGGCAA TACGCATTTC TTAGTTCTTT CTCATTGGGA TGCACAAGTT 318 0 

TCTCAACAAT TTAGAGTGCA GCCGAATTGT AAATATGTAT TACGTGTAAC AGCAGAGAAA 3240

GTAGGCGGCG GAGACGGATA CGTGACTATC CGGGATGATG CTCATCATAC AGAAACGCTT 33 00 

ACATTTAATG CATGTGATTA TGATATAAAT GGCACGTACG TGACTGATAA TACGTATCTA 33 6 0 

ACAAAAGAAG TGGTATTCCA TCCGGAGACA CAACACATGT GGGTAGAGGT AAATGAAACA 34 2 0 

GAAGGTGCAT TTCATATAGA TAGTATTGAA TTCGTTGAAA CAGAAAAGTA A 3471

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1156 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 

Met Asn Arg Asn Asn Gln Asn Glu Tyr Glu Ile Ile Asp Ala Pro His
1 5 10 15

Cys Gly Cys Pro Ser Asp Asp Asp Val Arg Tyr Pro Leu Ala Ser Asp 
20 25 30 

Pro Asn Ala Ala Leu Gln Asn Met Asn Tyr Lys Asp Tyr Leu Gln Met
35 40 45

Thr Asp Glu Asp Tyr Thr Asp Ser Tyr He Asn Pro Ser Leu Ser He 
50 55 60 

Ser Gly Arg Asp Ala Val Gin Thr Ala Leu Thr Val Val Gly Arg He 
65 70 75 80 
Leu Gly Ala Leu Gly Val Pro Phe Ser Gly Gln Ile Val Ser Phe Tyr
85 90 95 

Gin Phe Leu Leu Asn Thr Leu Trp Pro Val Asn Asp Thr Ala He Trp 
100 105 110 

Glu Ala Phe Met Arg Gin Val Glu Glu Leu Val Asn Gin Gin He Thr 
115 120 125 

Glu Phe Ala Arg Asn Gln Ala Leu Ala Arg Leu Gln Gly Leu Gly Asp
130 135 140

Ser Phe Asn Val Tyr Gin Arg Ser Leu Gin Asn Trp Leu Ala Asp Arg 
145 150 155 160 

Asn Asp Thr Arg Asn Leu Ser Val Val Arg Ala Gin Phe He Ala Leu 
165 170 175 

Asp Leu Asp Phe Val Asn Ala He Pro Leu Phe Ala Val Asn Gly Gin 
180 185 190 

Gin Val Pro Leu Leu Ser Val Tyr Ala Gin Ala Val Asn Leu His Leu 
195 200 205 

Leu Leu Leu Lys Asp Ala Ser Leu Phe Gly Glu Gly Trp Gly Phe Thr 
210 215 220 

Gin Gly Glu He Ser Thr Tyr Tyr Asp Arg Gin Leu Glu Leu Thr Ala 
225 230 235 240 

Lys Tyr Thr Asn Tyr Cys Glu Thr Trp Tyr Asn Thr Gly Leu Asp Arg 
245 250 255 

Leu Arg Gly Thr Asn Thr Glu Ser Trp Leu Arg Tyr His Gin Phe Arg 
260 265 270 

Arg Glu Met Thr Leu Val Val Leu Asp Val Val Ala Leu Phe Pro Tyr 
275 280 285

Tyr Asp Val Arg Leu Tyr Pro Thr Gly Ser Asn Pro Gin Leu Thr Arg 
290 295 300 

Glu Val Tyr Thr Asp Pro He Val Phe Asn Pro Pro Ala Asn Val Gly 
305 310 315 320 

Leu Cys Arg Arg Trp Gly Thr Asn Pro Tyr Asn Thr Phe Ser Glu Leu 
325 330 335 



Glu Asn Ala Phe He Arg Pro Pro His Leu Phe Asp Arg Leu Asn Ser 
340 345 350 



Leu Thr He Ser Ser Asn Arg Phe Pro Val Ser Ser Asn Phe Met Asp 
355 360 365 
Tyr Trp Ser Gly His Thr Leu Arg Arg Ser Tyr Leu Asn Asp Ser Ala
370 375 380

Val Gln Glu Asp Ser Tyr Gly Leu Ile Thr Thr Thr Arg Ala Thr Ile
385 390 395 400

Asn Pro Gly Val Asp Gly Thr Asn Arg He Glu Ser Thr Ala Val Asp 
405 410 415 

Phe Arg Ser Ala Leu Ile Gly Ile Tyr Gly Val Asn Arg Ala Ser Phe
420 425 430

Val Pro Gly Gly Leu Phe Asn Gly Thr Thr Ser Pro Ala Asn Gly Gly 
435 440 445 

Cys Arg Asp Leu Tyr Asp Thr Asn Asp Glu Leu Pro Pro Asp Glu Ser 
450 455 460 

Thr Gly Ser Ser Thr His Arg Leu Ser His Val Thr Phe Phe Ser Phe 
465 470 475 480 

Gin Thr Asn Gin Ala Gly Ser He Ala Asn Ala Gly Ser Val Pro Thr 
485 490 495 

Tyr Val Trp Thr Arg Arg Asp Val Asp Leu Asn Asn Thr He Thr Pro 
500 505 510 

Asn Arg He Thr Gin Leu Pro Leu Val Lys Ala Ser Ala Pro Val Ser 
515 520 525 

Gly Thr Thr Val Leu Lys Gly Pro Gly Phe Thr Gly Gly Gly He Leu 
530 535 540 

Arg Arg Thr Thr Asn Gly Thr Phe Gly Thr Leu Arg Val Thr Val Asn 
545 550 555 560 

Ser Pro Leu Thr Gin Arg Tyr Arg Val Arg Val Arg Phe Ala Ser Ser 
565 570 575 

Gly Asn Phe Ser He Arg He Leu Arg Gly Asn Thr Ser lie Ala Tyr 
580 585 590 

Gin Arg Phe Gly Ser Thr Met Asn Arg Gly Gin Glu Leu Thr Tyr Glu 
595 600 605 

Ser Phe Val Thr Ser Glu Phe Thr Thr Asn Gin Ser Asp Leu Pro Phe 
610 615 620 



Thr Phe Thr Gin Ala Gin Glu Asn Leu Thr lie Leu Ala Glu Gly Val 
625 630 635 640 



Ser Thr Gly Ser Glu Tyr Phe He Asp Arg lie Glu He lie Pro Val 
645 650 655 
Asn Pro Ala Arg Glu Ala Glu Glu Asp Leu Glu Ala Ala Lys Lys Ala
660 665 670 

Val Ala Asn Leu Phe Thr Arg Thr Arg Asp Gly Leu Gin Val Asn Val 
675 680 685 

Thr Asp Tyr Gin Val Asp Gin Ala Ala Asn Leu Val Ser Cys Leu Ser 
690 695 700 

Asp Glu Gln Tyr Gly His Asp Lys Lys Met Leu Leu Glu Ala Val Arg
705 710 715 720 

Ala Ala Lys Arg Leu Ser Arg Glu Arg Asn Leu Leu Gin Asp Pro Asp 
725 730 735 

Phe Asn Thr He Asn Ser Thr Glu Glu Asn Gly Trp Lys Ala Ser Asn 
740 745 750 

Gly Val Thr He Ser Glu Gly Gly Pro Phe Phe Lys Gly Arg Ala Leu 
755 760 765 

Gin Leu Ala Ser Ala Arg Glu Asn Tyr Pro Thr Tyr He Tyr Gin Lys 
770 775 780 

Val Asp Ala Ser Val Leu Lys Pro Tyr Thr Arg Tyr Arg Leu Asp Gly 
785 790 795 800 

Phe Val Lys Ser Ser Gin Asp Leu Glu He Asp Leu lie His His His 
805 810 815 

Lys Val His Leu Val Lys Asn Val Pro Asp Asn Leu Val Ser Asp Thr 
820 825 830 

Tyr Ser Asp Gly Ser Cys Ser Gly Ile Asn Arg Cys Asp Glu Gln His
835 840 845

Gin Val Asp Met Gin Leu Asp Ala Glu His His Pro Met Asp Cys Cys 
850 855 860 

Glu Ala Ala Gin Thr His Glu Phe Ser Ser Tyr He Asn Thr Gly Asp 
865 870 875 880 

Leu Asn Ala Ser Val Asp Gln Gly Ile Trp Val Val Leu Lys Val Arg
885 890 895

Thr Thr Asp Gly Tyr Ala Thr Leu Gly Asn Leu Glu Leu Val Glu Val 
900 905 910 



Gly Pro Leu Ser Gly Glu Ser Leu Glu Arg Glu Gin Arg Asp Asn Ala 
915 920 925 



Lys Trp Asn Ala Glu Leu Gly Arg Lys Arg Ala Glu He Asp Arg Val 
930 935 940 
Tyr Leu Ala Ala Lys Gln Ala Ile Asn His Leu Phe Val Asp Tyr Gln
945 950 955 960 

Asp Gin Gin Leu Asn Pro Glu He Gly Leu Ala Glu He Asn Glu Ala 
965 970 975 

Ser Asn Leu Val Glu Ser He Ser Gly Val Tyr Ser Asp Thr Leu Leu 
980 985 990 

Gin lie Pro Gly He Asn Tyr Glu He Tyr Thr Glu Leu Ser Asp Arg 
995 1000 1005 

Leu Gin Gin Ala Ser Tyr Leu Tyr Thr Ser Arg Asn Ala Val Gin Asn 
1010 1015 1020 

Gly Asp Phe Asn Ser Gly Leu Asp Ser Trp Asn Thr Thr Met Asp Ala 
1025 1030 1035 1040 

Ser Val Gin Gin Asp Gly Asn Met His Phe Leu Val Leu Ser His Trp 
1045 1050 1055 

Asp Ala Gin Val Ser Gin Gin Leu Arg Val Asn Pro Asn Cys Lys Tyr 
1060 1065 1070 

Val Leu Arg Val Thr Ala Arg Lys Val Gly Gly Gly Asp Gly Tyr Val 
1075 1080 1085 

Thr He Arg Asp Gly Ala His His Gin Glu Thr Leu Thr Phe Asn Ala 
1090 1095 1100 

Cys Asp Tyr Asp Val Asn Gly Thr Tyr Val Asn Asp Asn Ser Tyr Ile
1105 1110 1115 1120

Thr Glu Glu Val Val Phe Tyr Pro Glu Thr Lys His Met Trp Val Glu 
1125 1130 1135 

Val Ser Glu Ser Glu Gly Ser Phe Tyr He Asp Ser He Glu Phe He 
1140 1145 1150 

Glu Thr Gin Glu 
1155 



(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3471 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

ATGAATCGAA ATAATCAAAA TGAATATGAA ATTATTGATG CCCCCCATTG TGGGTGTCCA 60

TCAGATGACG ATCTGAGGTA TCCTTTGGCA AGTGACCCAA ATGCAGCGTT ACAAAATATG 12 0 

AACTATAAAG ATTACTTACA AATGACAGAT GAGGACTACA CTGATTCTTA TATAAATCCT 180 

AGTTTATCTA TTAGTGGTAG AGATGCAGTT CAGACTGCGC TTACTGTTGT TGGGAGAATA 24 0 

CTCGGGGCTT TAGGTGTTCC GTTTTCTGGA CAAATAGTGA GTTTTTATCA ATTCCTTTTA 300 

AATACACTGT GGCCAGTTAA TGATACAGCT ATATGGGAAG CTTTCATGCG ACAGGTGGAG 3 60 

GAACTTGTCA ATCAACAAAT AACAGAATTT GCAAGAAATC AGGCACTTGC AAGATTGCAA 420 

GGATTAGGAG ACTCTTTTAA TGTATATCAA CGTTCCCTTC AAAATTGGTT GGCTGATCGA 4 80 

AATGATACAC GAAATTTAAG TGTTGTTCGT GCTCAATTTA TAGCTTTAGA CCTTGATTTT 54 0 

GTTAATGCTA TTCCATTGTT TGCAGTAAAT GGACAGCAGG TTCCATTACT GTCAGTATAT 600 

GCACAAGCTG TGAATTTACA TTTGTTATTA TTAAAAGATG CATCTCTTTT TGGAGAAGGA 660 

TGGGGATTCA CACAGGGGGA AATTTCCACA TATTATGACC GTCAATTGGA ACTAACCGCT 72 0 

AAGTACACTA ATTACTGTGA AACTTGGTAT AATACAGGTT TAGATCGTTT AAGAGGAACA 780

AATACTGAAA GTTGGTTAAG ATATCATCAA TTCCGTAGAG AAATGACTTT AGTGGTATTA 84 0 

GATGTTGTGG CGCTATTTCC ATATTATGAT GTACGACTTT ATCCAACGGG ATCAAACCCA 900

CAGCTTACAC GTGAGGTATA TACAGATCCG ATTGTATTTA ATCCACCAGC TAATGTTGGA 960 

CTTTGCCGAC GTTGGGGTAC TAATCCCTAT AATACTTTTT CTGAGCTCGA AAATGCCTTC 1020 

ATTCGCCCAC CACATCTTTT TGATAGGCTG AATAGCTTAA CAATCAGCAG TAATCGATTT 1080

CCAGTTTCAT CTAATTTTAT GGATTATTGG TCAGGACATA CGTTACGCCG TAGTTATCTG 114 0 

AACGATTCAG CAGTACAAGA AGATAGTTAT GGCCTAATTA CAACCACAAG AGCAACAATT 1200 

AATCCTGGAG TTGATGGAAC AAACCGCATA GAGTCAACGG CAGTAGATTT TCGTTCTGCA 1260 

TTGATAGGTA TATATGGCGT GAATAGAGCT TCTTTTGTCC CAGGAGGCTT GTTTAATGGT 132 0 

ACGACTTCTC CTGCTAATGG AGCATGTAGA GATCTCTATG ATACAAATGA TGAATTACCA 1380 

CCAGATGAAA GTACCGGAAG TTCTACCCAT AGACTATCTC ATGTTACCTT TTTTAGTTTT 144 0 

CAAACTAATC AGGCTGGATC TATAGCTAAT GCAGGAAGTG TACCTACTTA TGTTTGGACC 15 00 

CGTCGTGATG TGGACCTTAA TAATACGATT ACCCCAAATA GAATTACACA ATTACCATTG 1560 

GTAAAGGCAT CTGCACCTGT TTCGGGTACT ACGGTCTTAA AAGGTCCAGG ATTTACAGGA 162 0 

GGGGGTATAC TCCGAAGAAC AACTAATGGC ACATTTGGAA CGTTAAGAGT AACAGTTAAT 16 80 



WO 98/00546 PCT/US97/11658 


TCACCATTAA CACAAAGATA TCGCGTAAGA GTTCGTTTTG CTTCATCAGG AAATTTCAGC 174 0 

ATAAGGATAC TGCGTGGAAA TACCTCTATA GCTTATCAAA GATTTGGGAG TACAATGAAC 18 00 

AGAGGACAGG AACTAACTTA CGAATCATTT GTCACAAGTG AGTTCACTAC TAATCAGAGC 18 60 

GATCTGCCTT TTACATTTAC ACAAGCTCAA GAAAATTTAA CAATCCTTGC AGAAGGTGTT 1920 

AGCACCGGTA GTGAATATTT TATAGATAGA ATTGAAATCA TCCCTGTGAA CCCGGCACGA 1980 

GAAGCAGAAG AGGATTTAGA AGCAGCGAAG AAAGCGGTGG CGAACTTGTT TACACGTACA 2040

AGGGACGGAT TACAGGTAAA TGTGACAGAT TATCAAGTGG ACCAAGCGGC AAATTTAGTG 2100 

TCATGCTTAT CCGATGAACA ATATGGGCAT GACAAAAAGA TGTTATTGGA AGCGGTAAGA 2160 

GCGGCAAAAC GCCTCAGCCG CGAACGCAAC TTACTTCAAG ATCCAGATTT TAATACAATC 22 20 

AATAGTACAG AAGAGAATGG CTGGAAGGCA AGTAACGGTG TTACTATTAG CGAGGGCGGT 2280

CCATTCTTTA AAGGTCGTGC ACTTCAGTTA GCAAGCGCAA GAGAAAATTA TCCAACATAC 23 40 

ATTTATCAAA AAGTAGATGC ATCGGTGTTA AAGCCTTATA CACGCTATAG ACTAGATGGA 2400

TTTGTGAAGA GTAGTCAAGA TTTAGAAATT GATCTCATCC ACCATCATAA AGTCCATCTT 2460

GTAAAAAATG TACCAGATAA TTTAGTATCT GATACTTACT CAGATGGTTC TTGCAGCGGA 2 5 20 

ATCAACCGTT GTGATGAACA GCATCAGGTA GATATGCAGC TAGATGCGGA GCATCATCCA 25 80 

ATGGATTGCT GTGAAGCGGC TCAAACACAT GAGTTTTCTT CCTATATTAA TACAGGGGAT 264 0 

CTAAATGCAA GTGTAGATCA GGGCATTTGG GTTGTATTAA AAGTTCGAAC AACAGATGGG 2 700 

TATGCGACGT TAGGAAATCT TGAATTGGTA GAGGTTGGGC CATTATCGGG TGAATCTCTA 2760 

GAACGGGAAC AAAGAGATAA TGCGAAATGG AATGCAGAGC TAGGAAGAAA ACGTGCAGAA 2 820 

ATAGATCGTG TGTATTTAGC TGCGAAACAA GCAATTAATC ATCTGTTTGT AGACTATCAA 28 80 

GATCAACAAT TAAATCCAGA AATTGGGCTA GCAGAAATTA ATGAAGCTTC AAATCTTGTA 294 0 

GAGTCAATTT CGGGTGTATA TAGTGATACA CTATTACAGA TTCCTGGGAT TAACTACGAA 3000

ATTTACACAG AGTTATCCGA TCGCTTACAA CAAGCATCGT ATCTGTATAC GTCTAGAAAT 3 060 

GCGGTGCAAA ATGGAGACTT TAACAGTGGT CTAGATAGTT GGAATACAAC TATGGATGCA 312 0 

TCGGTTCAGC AAGATGGCAA TATGCATTTC TTAGTTCTTT CGCATTGGGA TGCACAAGTT 3180 

TCCCAACAAT TGAGAGTAAA TCCGAATTGT AAGTATGTCT TACGTGTGAC AGCAAGAAAA 324 0 

GTAGGAGGCG GAGATGGATA CGTCACAATC CGAGATGGCG CTCATCACCA AGAAACTCTT 33 00 

ACATTTAATG CATGTGACTA CGATGTAAAT GGTACGTATG TCAATGACAA TTCGTATATA 33 60 
ACAGAAGAAG TGGTATTCTA CCCAGAGACA AAACATATGT GGGTAGAGGT GAGTGAATCC 3420

GAAGGTTCAT TCTATATAGA CAGTATTGAG TTTATTGAAA CACAAGAGTA G 3471



(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1150 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 

Met Asn Arg Asn Asn Pro Asn Glu Tyr Glu Ile Ile Asp Ala Pro Tyr
1 5 10 15 

Cys Gly Cys Pro Ser Asp Asp Asp Val Arg Tyr Pro Leu Ala Ser Asp 
20 25 30 

Pro Asn Ala Ala Phe Gin Asn Met Asn Tyr Lys Glu Tyr Leu Gin Thr 
35 40 45 

Tyr Asp Gly Asp Tyr Thr Gly Ser Leu Ile Asn Pro Asn Leu Ser Ile
50 55 60

Asn Pro Arg Asp Val Leu Gin Thr Gly lie Asn He Val Gly Arg He 
65 70 75 80 

Leu Gly Phe Leu Gly Val Pro Phe Ala Gly Gln Leu Val Thr Phe Tyr
85 90 95

Thr Phe Leu Leu Asn Gin Leu Trp Pro Thr Asn Asp Asn Ala Val Trp 
100 105 110 

Glu Ala Phe Met Ala Gin lie Glu Glu Leu He Asp Gin Lys lie Ser 
115 120 125 

Ala Gin Val Val Arg Asn Ala Leu Asp Asp Leu Thr Gly Leu His Asp 
130 135 140 

Tyr Tyr Glu Glu Tyr Leu Ala Ala Leu Glu Glu Trp Leu Glu Arg Pro 
145 150 155 160 

Asn Gly Ala Arg Ala Asn Leu Val Thr Gin Arg Phe Glu Asn Leu His 
165 170 175 

Thr Ala Phe Val Thr Arg Met Pro Ser Phe Gly Thr Gly Pro Gly Ser 
180 185 190 



Gin Arg Asp Ala Val Ala Leu Leu Thr Val Tyr Ala Gin Ala Ala Asn 
195 200 205 
Leu His Leu Leu Leu Leu Lys Asp Ala Glu Ile Tyr Gly Ala Arg Trp
210 215 220 

Gly Leu Gin Gin Gly Gin He Asn Leu Tyr Phe Asn Ala Gin Gin Glu 
225 230 235 240 

Arg Thr Arg He Tyr Thr Asn His Cys Val Glu Thr Tyr Asn Arg Gly 
245 250 255 

Leu Glu Asp Val Arg Gly Thr Asn Thr Glu Ser Trp Leu Asn Tyr His 
260 265 270 

Arg Phe Arg Arg Glu Met Thr Leu Met Ala Met Asp Leu Val Ala Leu 
275 280 285 

Phe Pro Phe Tyr Asn Val Arg Gin Tyr Pro Asn Gly Ala Asn Pro Gin 
290 295 300 

Leu Thr Arg Glu He Tyr Thr Asp Pro He Val Tyr Asn Pro Pro Ala 
305 310 315 320 

Asn Gin Gly lie Cys Arg Arg Trp Gly Asn Asn Pro Tyr Asn Thr Phe 
325 330 335 



Ser Glu Leu Glu Asn Ala Phe He Arg Pro Pro His Leu Phe Glu Arg 
340 345 350 



Leu Asn Arg Leu Thr lie Ser Arg Asn Arg Tyr Thr Ala Pro Thr Thr 
355 360 365 

Asn Ser Phe Leu Asp Tyr Trp Ser Gly His Thr Leu Gln Ser Gln His
370 375 380

Ala Asn Asn Pro Thr Thr Tyr Glu Thr Ser Tyr Gly Gln Ile Thr Ser
385 390 395 400

Asn Thr Arg Leu Phe Asn Thr Thr Asn Gly Ala Arg Ala Ile Asp Ser
405 410 415



Arg Ala Arg Asn Phe Gly Asn Leu Tyr Ala Asn Leu Tyr Gly Val Ser 
420 425 430 

Ser Leu Asn lie Phe Pro Thr Gly Val Met Ser Glu lie Thr Asn Ala 
435 440 445 

Ala Asn Thr Cys Arg Gin Asp Leu Thr Thr Thr Glu Glu Leu Pro Leu 
450 455 460 

Glu Asn Asn Asn Phe Asn Leu Leu Ser His Val Thr Phe Leu Arg Phe 
465 470 475 480 



Asn Thr Thr Gin Gly Gly Pro Leu Ala Thr Leu Gly Phe Val Pro Thr 
485 490 495 
Tyr Val Trp Thr Arg Glu Asp Val Asp Phe Thr Asn Thr Ile Thr Ala
500 505 510 

Asp Arg He Thr Gin Leu Pro Trp Val Lys Ala Ser Glu He Gly Gly 
515 520 525 

Gly Thr Thr Val Val Lys Gly Pro Gly Phe Thr Gly Gly Asp He Leu 
530 535 540 

Arg Arg Thr Asp Gly Gly Ala Val Gly Thr Ile Arg Ala Asn Val Asn
545 550 555 560

Ala Pro Leu Thr Gin Gin Tyr Arg He Arg Leu Arg Tyr Ala Ser Thr 
565 570 575 



Thr Ser Phe Val Val Asn Leu Phe Val Asn Asn Ser Ala Ala Gly Phe 
580 585 590 

Thr Leu Pro Ser Thr Met Ala Gin Asn Gly Ser Leu Thr Tyr Glu Ser 
595 600 605 

Phe Asn Thr Leu Glu Val Thr His Thr lie Arg Phe Ser Gin Ser Asp 
610 615 620 

Thr Thr Leu Arg Leu Asn Ile Phe Pro Ser Ile Ser Gly Gln Glu Val
625 630 635 640

Tyr Val Asp Lys Leu Glu He Val Pro He Asn Pro Thr Arg Glu Ala 
645 650 655 

Glu Glu Asp Leu Glu Asp Ala Lys Lys Ala Val Ala Ser Leu Phe Thr 
660 665 670 

Arg Thr Arg Asp Gly Leu Gln Val Asn Val Thr Asp Tyr Gln Val Asp
675 680 685

Gin Ala Ala Asn Leu Val Ser Cys Leu Ser Asp Glu Gin Tyr Gly His 
690 695 700 

Asp Lys Lys Met Leu Leu Glu Ala Val Arg Ala Ala Lys Arg Leu Ser 
705 710 715 720

Arg Glu Arg Asn Leu Leu Gin Asp Pro Asp Phe Asn Glu He Asn Ser 
725 730 735 

Thr Glu Glu Asn Gly Trp Lys Ala Ser Asn Gly Val Thr He Ser Glu 
740 745 750 

Gly Gly Pro Phe Phe Lys Gly Arg Ala Leu Gin Leu Ala Ser Ala Arg 
755 760 765 

Glu Asn Tyr Pro Thr Tyr He Tyr Gin Lys Val Asp Ala Ser Thr Leu 
770 775 780 
Lys Pro Tyr Thr Arg Tyr Lys Leu Asp Gly Phe Val Gln Ser Ser Gln
785 790 795 800

Asp Leu Glu lie Asp Leu He His His His Lys Val His Leu Val Lys 
805 810 815 

Asn Val Pro Asp Asn Leu Val Ser Asp Thr Tyr Ser Asp Gly Ser Cys 
820 825 830 

Ser Gly lie Asn Arg Cys Glu Glu Gin His Gin Val Asp Val Gin Leu 
835 840 845 

Asp Ala Glu Asp His Pro Lys Asp Cys Cys Glu Ala Ala Gin Thr His 
850 855 860 

Glu Phe Ser Ser Tyr Ile His Thr Gly Asp Leu Asn Ala Ser Val Asp
865 870 875 880

Gin Gly He Trp Val Val Leu Gin Val Arg Thr Thr Asp Gly Tyr Ala 
885 890 895 

Thr Leu Gly Asn Leu Glu Leu Val Glu Val Gly Pro Leu Ser Gly Glu 
900 905 910 

Ser Leu Glu Arg Glu Gin Arg Asp Asn Ala Lys Trp Asn Glu Glu Val 
915 920 925 

Gly Arg Lys Arg Ala Glu Thr Asp Arg He Tyr Gin Asp Ala Lys Gin 
930 935 940 

Ala He Asn His Leu Phe Val Asp Tyr Gin Asp Gin Gin Leu Ser Pro 
945 950 955 960 

Glu Val Gly Met Ala Asp He He Asp Ala Gin Asn Leu He Ala Ser 
965 970 975 

He Ser Asp Val Tyr Ser Asp Ala Val Leu Gin He Pro Gly He Asn 
980 985 990 

Tyr Glu Met Tyr Thr Glu Leu Ser Asn Arg Leu Gin Gin Ala Ser Tyr 
995 1000 1005 

Leu Tyr Thr Ser Arg Asn Val Val Gin Asn Gly Asp Phe Asn Ser Gly 
1010 1015 1020 

Leu Asp Ser Trp Asn Ala Thr Thr Asp Thr Ala Val Gin Gin Asp Gly 
1025 1030 1035 1040 

Asn Met His Phe Leu Val Leu Ser His Trp Asp Ala Gin Val Ser Gin 
1045 1050 1055 



Gin Phe Arg Val Gin Pro Asn Cys Lys Tyr Val Leu Arg Val Thr Ala 
1060 1065 1070 
Lys Lys Val Gly Asn Gly Asp Gly Tyr Val Thr Ile Gln Asp Gly Ala
1075 1080 1085

His His Arg Glu Thr Leu Thr Phe Asn Ala Cys Asp Tyr Asp Val Asn 
1090 1095 1100 

Gly Thr His Val Asn Asp Asn Ser Tyr Ile Thr Lys Glu Leu Val Phe
1105 1110 1115 1120

Tyr Pro Lys Thr Glu His Met Trp Val Glu Val Ser Glu Thr Glu Gly
1125 1130 1135

Thr Phe Tyr Ile Asp Ser Ile Glu Phe Ile Glu Thr Gln Glu
1140 1145 1150



(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3453 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:

ATGAATCGAA ATAATCCAAA TGAATATGAA ATTATTGATG CCCCCTATTG TGGGTGTCCG 60 

TCAGATGATG ATGTGAGGTA TCCTTTGGCA AGTGACCCAA ATGCAGCGTT CCAAAATATG 120 

AACTATAAAG AGTATTTACA AACGTATGAT GGAGACTACA CAGGTTCTCT TATCAATCCT 18 0 

AACTTATCTA TTAATCCTAG AGATGTACTA CAAACAGGTA TTAATATTGT GGGAAGAATA 24 0 

CTAGGGTTTT TAGGTGTTCC ATTTGCGGGT CAACTAGTTA CTTTCTATAC CTTTCTCTTA 3 00 

AATCAGTTGT GGCCAACTAA TGATAATGCA GTATGGGAAG CTTTTATGGC GCAAATAGAA 3 60 

GAGCTAATCG ATCAAAAAAT ATCGGCGCAA GTAGTAAGGA ATGCACTCGA TGACTTAACT 4 20 

GGATTACACG ATTATTATGA GGAGTATTTA GCAGCATTAG AGGAGTGGCT GGAAAGACCG 4 80 

AACGGAGCAA GAGCTAACTT AGTTACACAG AGGTTTGAAA ACCTGCATAC TGCATTTGTA 540

ACTAGAATGC CAAGCTTTGG TACGGGTCCT GGTAGTCAAA GAGATGCGGT AGCGTTGTTG 6 00 

ACGGTATATG CACAAGCAGC GAATTTGCAT TTGTTATTAT TAAAAGATGC AGAAATCTAT 660 

GGGGCAAGAT GGGGACTTCA ACAAGGGCAA ATTAACTTAT ATTTTAATGC TCAACAAGAA 72 0 

CGTACTCGAA TTTATACCAA TCATTGCGTG GAAACATATA ATAGAGGATT AGAAGATGTA 780

AGAGGAACAA ATACAGAAAG TTGGTTAAAT TACCATCGAT TCCGTAGAGA GATGACATTA 84 0 
ATGGCAATGG ATTTAGTGGC CCTATTCCCA TTCTATAATG TGCGACAATA TCCAAATGGG 900

GCAAATCCAC AGCTTACACG TGAAATATAT ACAGATCCAA TCGTATATAA TCCACCAGCT 96 0 

AATCAGGGAA TTTGCCGACG TTGGGGGAAT AATCCGTATA ATACATTTTC TGAACTTGAA 102 0 

AATGCTTTTA TTCGCCCGCC ACATCTTTTT GAAAGGTTGA ACAGATTAAC TATTTCTAGA 108 0 

AACCGATATA CAGCTCCAAC AACTAATAGC TTCCTAGACT ATTGGTCAGG TCATACTTTA 114 0 

CAAAGCCAAC ATGCAAATAA CCCGACGACA TATGAAACTA GTTACGGTCA GATTACCTCT 1200 

AACACACGTT TATTCAATAC GACTAATGGA GCCCGTGCAA TAGATTCAAG GGCAAGAAAT 126 0 

TTTGGTAACT TATACGCTAA TTTGTATGGC GTTAGCAGCT TGAACATTTT CCCAACAGGT 132 0 

GTGATGAGTG AAATCACCAA TGCAGCTAAT ACGTGTCGGC AAGACCTTAC TACAACTGAA 13 8 0 

GAACTACCAC TAGAGAATAA TAATTTTAAT CTTTTATCTC ATGTTACTTT CTTACGCTTC 14 4 0 

AATACTACTC AGGGTGGCCC CCTTGCAACT CTAGGGTTTG TACCCACATA TGTGTGGACA 150 0 

CGTGAAGATG TAGATTTTAC GAACACAATT ACTGCGGATA GAATTACACA ACTACCATGG 1560 

GTAAAGGCAT CTGAAATAGG TGGGGGTACT ACTGTCGTGA AAGGTCCAGG ATTTACAGGA 162 0 

GGGGATATAC TTCGAAGAAC GGACGGTGGT GCAGTTGGAA CGATTAGAGC TAATGTTAAT 1680

GCCCCATTAA CACAACAATA TCGTATAAGA TTACGCTATG CTTCGACAAC AAGTTTTGTT 174 0 

GTTAATTTAT TTGTTAATAA TAGTGCGGCT GGCTTTACTT TACCGAGTAC AATGGCTCAA 180 0 

AATGGTTCTT TAACATACGA GTCGTTTAAT ACCTTAGAGG TAACTCATAC TATTAGATTT 1860 

TCACAGTCAG ATACTACACT TAGGTTGAAT ATATTCCCGT CTATCTCTGG TCAAGAAGTG 192 0 

TATGTAGATA AACTTGAAAT CGTTCCAATT AACCCGACAC GAGAAGCGGA AGAAGATTTA 1980

GAAGATGCAA AGAAAGCGGT GGCGAGCTTG TTTACACGTA CAAGGGATGG ATTACAGGTA 2 04 0 

AATGTGACAG ATTACCAAGT CGATCAGGCG GCAAATTTAG TGTCGTGCTT ATCAGATGAA 210 0 

CAATATGGGC ATGATAAAAA GATGTTATTG GAAGCCGTAC GCGCAGCAAA ACGCCTCAGC 2160 

CGCGAACGCA ACTTACTTCA AGATCCAGAT TTTAATGAAA TAAATAGCAC AGAAGAAAAT 222 0 

GGCTGGAAGG CAAGTAACGG TGTTACTATT AGCGAGGGCG GTCCATTCTT TAAAGGTCGT 228 0 

GCACTTCAGT TAGCAAGCGC ACGTGAAAAT TACCCAACAT ACATCTATCA AAAGGTAGAT 2 34 0 

GCATCGACGT TAAAACCTTA TACACGATAT AAACTAGATG GATTTGTGCA AAGTAGTCAA 24 0 0 

GATTTAGAAA TTGACCTCAT TCATCATCAT AAAGTCCACC TCGTGAAAAA TGTACCAGAT 2460

AATTTAGTAT CTGATACTTA TTCTGATGGC TCATGTAGTG GAATTAACCG TTGTGAGGAA 2 52 0 
CAACATCAGG TAGATGTGCA GCTAGATGCG GAGGATCATC CAAAGGATTG TTGTGAAGCG 2580

GCTCAAACAC ATGAGTTTTC TTCCTATATT CATACAGGTG ATCTAAATGC AAGTGTAGAT 264 0 

CAAGGCATTT GGGTTGTATT GCAGGTTCGA ACAACAGATG GTTATGCGAC GTTAGGAAAT 2 700 

CTTGAATTGG TAGAGGTTGG TCCATTATCG GGTGAATCTT TAGAACGAGA ACAAAGAGAT 2760 

AATGCGAAAT GGAATGAAGA GGTAGGAAGA AAGCGTGCAG AAACAGATCG CATATATCAA 2 82 0 

GATGCGAAAC AAGCAATTAA CCATCTATTT GTAGACTATC AAGATCAACA ATTAAGTCCA 2880

GAGGTAGGGA TGGCGGATAT TATTGATGCT CAAAATCTTA TCGCATCAAT TTCAGATGTA 2 94 0 

TATAGCGATG CAGTACTGCA AATCCCTGGG ATTAACTACG AGATGTATAC AGAGTTATCC 3 000 

AATCGATTAC AACAAGCATC GTATCTGTAT ACGTCTCGAA ATGTCGTGCA AAATGGGGAC 3 060 

TTTAACAGTG GTTTAGATAG TTGGAATGCA ACAACTGATA CAGCTGTTCA GCAGGATGGC 3120 

AATATGCATT TCTTAGTTCT TTCCCATTGG GATGCACAAG TTTCTCAACA ATTTAGAGTA 3180

CAGCCGAATT GTAAATATGT GTTACGTGTG ACAGCGAAGA AAGTAGGGAA CGGAGATGGA 324 0 

TATGTTACGA TCCAAGATGG CGCTCATCAC CGAGAAACAC TGACATTCAA TGCATGTGAC 3 3 00 

TACGATGTAA ATGGTACGCA TGTAAATGAT AATTCGTATA TTACAAAAGA ATTGGTGTTC 33 60 

TATCCAAAGA CGGAACATAT GTGGGTAGAG GTAAGTGAAA CAGAAGGTAC CTTCTATATA 3 42 0 

GACAGCATTG AGTTCATTGA AACACAAGAG TAG 34 53 



(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1134 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

Met Asp Asn Asn Pro Asn Ile Asn Glu Cys Ile Pro Tyr Asn Cys Leu
1 5 10 15

Ser Asn Pro Glu Val Glu Val Leu Gly Gly Glu Arg Gly Asn Val Arg 
20 25 30 

Thr Gly Leu Gin Thr Gly He Asp He Val Ala Val Val Val Gly Ala 
35 40 45 
Leu Gly Gly Pro Val Gly Gly Ile Leu Thr Gly Phe Leu Ser Thr Leu
50 55 60

Phe Gly Phe Leu Trp Pro Ser Asn Asp Gln Ala Val Trp Glu Ala Phe
65 70 75 80

He Glu Gin Met Glu Glu Leu He Glu Gin Arg He Ser Asp Gin Val 
85 90 95 

Val Arg Thr Ala Leu Asp Asp Leu Thr Gly Ile Gln Asn Tyr Tyr Asn
100 105 110

Gln Tyr Leu He Ala Leu Lys Glu Trp Glu Glu Arg Pro Asn Gly Val 
115 120 125 

Arg Ala Asn Leu Val Leu Gin Arg Phe Glu lie Leu His Ala Leu Phe 
130 135 140 

Val Ser Ser Met Pro Ser Phe Gly Ser Gly Pro Gly Ser Gin Arg Phe 
145 150 155 160 

Gin Ala Gin Leu Leu Val Val Tyr Ala Gin Ala Ala Asn Leu His Leu 
165 170 175 

Leu Leu Leu Ala Asp Ala Glu Lys Tyr Gly Ala Arg Trp Gly Leu Arg 
180 185 190

Glu Ser Gin He Gly Asn Leu Tyr Phe Asn Glu Leu Gin Thr Arg Thr 
195 200 205 

Arg Asp Tyr Thr Asn His Cys Val Asn Ala Tyr Asn Asn Gly Leu Ala 
210 215 220 

Gly Leu Arg Gly Thr Ser Ala Glu Ser Trp Leu Lys Tyr His Gin Phe 
225 230 235 240 

Arg Arg Glu Ala Thr Leu Met Ala Met Asp Leu He Ala Leu Phe Pro 
245 250 255 

Tyr Tyr Asn Thr Arg Arg Tyr Pro Ile Ala Val Asn Pro Gln Leu Thr
260 265 270

Arg Glu Val Tyr Thr Asp Pro Leu Gly Val Pro Ser Glu Glu Ser Ser 
275 280 285 

Leu Phe Pro Glu Leu Arg Cys Leu Arg Trp Gin Glu Thr Ser Ala Met 
290 295 300 



Thr Phe Ser Asn Leu Glu Asn Ala lie He Ser Ser Pro His Leu Phe 
305 310 315 320 



Asp Thr lie Asn Asn Leu Met He Tyr Thr Gly Ser Phe Ser Val His 
325 330 335 
Leu Thr Asn Gln Leu Ile Glu Gly Trp Ile Gly His Ser Val Thr Ser
340 345 350 

Ser Leu Leu Ala Ser Gly Pro Thr Thr Val Leu Arg Arg Asn Tyr Gly 
355 360 365 

Ser Thr Thr Ser He Val Asn Tyr Phe Ser Phe Asn Asp Arg Asp Val 
370 375 380 

Tyr Gln Ile Asn Thr Arg Ser His Thr Gly Leu Gly Phe Gln Asn Ala
385 390 395 400

Pro Leu Phe Gly He Thr Arg Ala Gin Phe Tyr Pro Gly Gly Thr Tyr 
405 410 415 

Ser Val Thr Gin Arg Asn Ala Leu Thr Cys Glu Gin Asn Tyr Asn Ser 
420 425 430 

He Asp Glu Leu Pro Ser Leu Asp Pro Asn Glu Pro lie Ser Arg Ser 
435 440 445 

Tyr Ser His Arg Leu Ser His lie Thr Ser Tyr Leu His Arg Val Leu 
450 455 460 

Thr He Asp Gly lie Asn lie Tyr Ser Gly Asn Leu Pro Thr Tyr Val 
465 470 475 480 

Trp Thr His Arg Asp Val Asp Leu Thr Asn Thr He Thr Ala Asp Arg 
485 490 495 

lie Thr Gin Leu Pro Leu Val Lys Ser Phe Glu lie Pro Ala Gly Thr 
500 505 510 

Thr Val Val Arg Gly Pro Gly Phe Thr Gly Gly Asp lie Leu Arg Arg 
515 520 525 

Thr Gly Val Gly Thr Phe Gly Thr He Arg Val Arg Thr Thr Ala Pro 
530 535 540 

Leu Thr Gin Arg Tyr Arg lie Arg Phe Arg Phe Ala Ser Thr Thr Asn 
545 550 555 560 

Leu Phe Ile Gly Ile Arg Val Gly Asp Arg Gln Val Asn Tyr Phe Asp
565 570 575

Phe Gly Arg Thr Met Asn Arg Gly Asp Glu Leu Arg Tyr Glu Ser Phe 
580 585 590 



Ala Thr Arg Glu Phe Thr Thr Asp Phe Asn Phe Arg Gin Pro Gin Glu 
595 600 605 



Leu lie Ser Val Phe Ala Asn Ala Phe Ser Ala Gly Gin Glu Val Tyr 
610 615 620 
Phe Asp Arg Ile Glu Ile Ile Pro Val Asn Pro Ala Arg Glu Ala Lys
625 630 635 640 

Glu Asp Leu Glu Ala Ala Lys Lys Ala Val Ala Ser Leu Phe Thr Arg 
645 650 655 

Thr Arg Asp Gly Leu Gin Val Asn Val Lys Asp Tyr Gin Val Asp Gin 
660 665 670 

Ala Ala Asn Leu Val Ser Cys Leu Ser Asp Glu Gln Tyr Gly Tyr Asp
675 680 685

Lys Lys Met Leu Leu Glu Ala Val Arg Ala Ala Lys Arg Leu Ser Arg 
690 695 700 

Glu Arg Asn Leu Leu Gin Asp Pro Asp Phe Asn Thr He Asn Ser Thr 
705 710 715 720 

Glu Glu Asn Gly Trp Lys Ala Ser Asn Gly Val Thr He Ser Glu Gly 
725 730 735 

Gly Pro Phe Tyr Lys Gly Arg Ala Leu Gin Leu Ala Ser Ala Arg Glu 
740 745 750 

Asn Tyr Pro Thr Tyr lie Tyr Gin Lys Val Asp Ala Ser Glu Leu Lys 
755 760 765 

Pro Tyr Thr Arg Tyr Arg Ser Asp Gly Phe Val Lys Ser Ser Gin Asp 
770 775 780 

Leu Glu He Asp Leu He His His His Lys Val His Leu Val Lys Asn 
785 790 795 800 

Val Pro Asp Asn Leu Val Ser Asp Thr Tyr Pro Asp Asp Ser Cys Ser 
805 810 815 

Gly He Asn Arg Cys Gin Glu Gin Gin Met Val Asn Ala Gin Leu Glu 
820 825 830 

Thr Glu His His His Pro Met Asp Cys Cys Glu Ala Ala Gin Thr His 
835 840 845 

Glu Phe Ser Ser Tyr He Asp Thr Gly Asp Leu Asn Ser Ser Val Asp 
850 855 860 

Gin Gly He Trp Ala He Phe Lys Val Arg Thr Thr Asp Gly Tyr Ala 
865 870 875 880 



Thr Leu Gly Asn Leu Glu Leu Val Glu Val Gly Pro Leu Ser Gly Glu 
885 890 895 



Ser Leu Glu Arg Glu Gin Arg Asp Asn Thr Lys Trp Ser Ala Glu Leu 
900 905 910 
Gly Arg Lys Arg Ala Glu Thr Asp Arg Val Tyr Gln Asp Ala Lys Gln
915 920 925 

Ser He Asn His Leu Phe Val Asp Tyr Gin Asp Gin Gin Leu Asn Pro 
930 935 940 

Glu Ile Gly Met Ala Asp Ile Met Asp Ala Gln Asn Leu Val Ala Ser
945 950 955 960

Ile Ser Asp Val Tyr Ser Asp Ala Val Leu Gln Ile Pro Gly Ile Asn
965 970 975

Tyr Glu He Tyr Thr Glu Leu Ser Asn Arg Leu Gin Gin Ala Ser Tyr 
980 985 990 

Leu Tyr Thr Ser Arg Asn Ala Val Gin Asn Gly Asp Phe Asn Asn Gly 
995 1000 1005 



Leu Asp Ser Trp Asn Ala Thr Ala Gly Ala Ser Val Gln Gln Asp Gly
1010 1015 1020

Asn Thr His Phe Leu Val Leu Ser His Trp Asp Ala Gln Val Ser Gln
1025 1030 1035 1040

Gln Phe Arg Val Gln Pro Asn Cys Lys Tyr Val Leu Arg Val Thr Ala
1045 1050 1055

Glu Lys Val Gly Gly Gly Asp Gly Tyr Val Thr Ile Arg Asp Gly Ala
1060 1065 1070



His His Thr Glu Thr Leu Thr Phe Asn Ala Cys Asp Tyr Asp He Asn 
1075 1080 1085 

Gly Thr Tyr Val Thr Asp Asn Thr Tyr Leu Thr Lys Glu Val He Phe 
1090 1095 1100 

Tyr Ser His Thr Glu His Met Trp Val Glu Val Asn Glu Thr Glu Gly 
1105 1110 1115 1120 

Ala Phe His He Asp Ser He Glu Phe Val Glu Thr Glu Lys 
1125 1130 



(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 3411 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

ATGGATAACA ATCCGAACAT CAATGAATGC ATTCCTTATA ATTGTTTAAG TAACCCTGAA 60

GTAGAAGTAT TAGGTGGAGA AAGAGGAAAT GTTAGAACTG GACTACAAAC TGGAATTGAT 12 0 

ATTGTTGCAG TAGTAGTAGG TGCTTTAGGT GGACCAGTTG GTGGCATACT CACTGGTTTT 180 

CTTTCTACTC TTTTTGGTTT TCTTTGGCCA TCTAATGATC AAGCAGTATG GGAAGCTTTT 24 0 

ATAGAACAAA TGGAAGAACT GATTGAACAA AGGATATCAG ATCAAGTAGT AAGGACTGCA 3 00 

CTCGATGACT TAACTGGAAT TCAAAATTAT TATAATCAAT ATCTAATAGC ATTAAAGGAA 360

TGGGAGGAAA GACCAAACGG CGTAAGAGCA AACTTAGTTT TGCAAAGATT TGAAATCTTG 42 0 

CACGCGCTAT TTGTAAGTAG TATGCCAAGT TTTGGTAGTG GCCCTGGAAG TCAAAGGTTT 4 80 

CAGGCACAAT TGTTGGTTGT TTATGCGCAA GCAGCAAATC TTCATTTACT ATTATTAGCT 54 0 

GATGCTGAAA AGTATGGGGC AAGATGGGGA CTCCGTGAAT CCCAGATAGG AAATTTATAT 600 

TTTAATGAAC TACAAACTCG TACTCGAGAT TACACCAACC ATTGTGTAAA CGCGTATAAT 660 

AACGGGTTAG CCGGGTTACG AGGAACGAGC GCTGAAAGTT GGTTAAAGTA CCATCAATTC 72 0 

CGCAGAGAAG CAACCTTAAT GGCAATGGAT TTGATAGCTT TATTTCCATA TTATAACACC 7 80 

CGGCGATATC CAATCGCAGT AAATCCTCAG CTTACACGTG AGGTATATAC AGATCCATTA 840

GGCGTTCCTT CTGAAGAATC AAGTTTATTT CCAGAATTGA GATGCTTAAG ATGGCAAGAG 900 

ACTTCTGCCA TGACTTTTTC AAATTTGGAA AATGCAATAA TTTCGTCACC ACATCTATTT 960 

GACACAATAA ACAATTTAAT GATTTATACC GGTTCCTTTT CCGTTCACCT AACCAATCAA 102 0 

TTAATTGAAG GGTGGATTGG ACATTCTGTA ACTAGTAGTT TGTTGGCCAG TGGACCAACA 1080 

ACAGTACTGA GAAGAAATTA CGGTAGCACG ACATCTATTG TAAACTATTT TAGTTTTAAT 114 0 

GATCGTGATG TTTATCAGAT TAATACGAGA TCACATACTG GGTTGGGATT CCAGAACGCA 1200 

CCTTTATTTG GAATCACTAG AGCTCAATTT TACCCAGGTG GGACTTATTC AGTAACTCAA 1260

CGAAATGCAT TAACATGTGA ACAAAATTAT AATTCAATTG ATGAGTTACC GAGCCTAGAC 13 2 0 

CCAAATGAAC CTATCAGTAG AAGTTATAGT CATAGATTAT CTCATATTAC CTCCTATTTG 13 8 0 

CATCGTGTAT TGACTATTGA TGGTATTAAT ATATATTCAG GAAATCTCCC TACTTATGTA 144 0 

TGGACCCATC GCGATGTGGA CCTTACAAAC ACGATTACCG CAGATAGAAT TACACAACTA 150 0 

CCATTGGTAA AGTCATTTGA AATACCTGCG GGTACTACTG TCGTAAGAGG ACCAGGTTTT 1560 

ACAGGAGGGG ATATACTCCG AAGAACAGGG GTTGGTACAT TTGGAACAAT AAGGGTAAGG 162 0 

ACTACTGCCC CCTTAACACA AAGATATCGC ATAAGATTCC GTTTCGCTTC TACCACAAAT 1680

TTGTTCATTG GTATAAGAGT TGGTGATAGA CAAGTAAATT ATTTTGACTT CGGAAGAACA 1740

ATGAACAGAG GAGATGAATT AAGGTACGAA TCTTTTGCTA CAAGGGAGTT TACTACTGAT 180 0 

TTTAATTTTA GACAACCTCA AGAATTAATC TCAGTGTTTG CAAATGCATT TAGCGCTGGT 1860 

CAAGAAGTTT ATTTTGATAG AATTGAGATT ATCCCCGTTA ATCCCGCACG AGAGGCGAAA 192 0 

GAGGATCTAG AAGCAGCAAA GAAAGCGGTG GCGAGCTTGT TTACACGCAC AAGGGACGGA 198 0 

TTACAAGTAA ATGTGAAAGA TTATCAAGTC GATCAAGCGG CAAATTTAGT GTCATGCTTA 2040

TCAGATGAAC AATATGGGTA TGACAAAAAG ATGTTATTGG AAGCGGTACG CGCGGCAAAA 2100 

CGCCTCAGCC GAGAACGTAA CTTACTTCAG GATCCAGATT TTAATACAAT CAATAGTACA 216 0 

GAAGAAAATG GATGGAAAGC AAGTAACGGC GTTACTATTA GTGAGGGCGG TCCATTCTAT 222 0 

AAAGGCCGTG CACTTCAGCT AGCAAGTGCA CGAGAAAATT ATCCAACATA CATTTATCAA 22 80 

AAAGTAGATG CATCGGAGTT AAAACCTTAT ACACGTTATA GATCAGATGG GTTCGTGAAG 234 0 

AGTAGTCAAG ATTTAGAAAT TGATCTCATT CACCATCATA AAGTCCATCT TGTGAAAAAT 24 00 

GTACCAGATA ATTTAGTATC TGATACTTAC CCAGATGATT CTTGTAGTGG AATCAATCGA 24 60 

TGTCAGGAAC AACAGATGGT AAATGCGCAA CTGGAAACAG AGCATCATCA TCCGATGGAT 2520

TGCTGTGAAG CAGCTCAAAC ACATGAGTTT TCTTCCTATA TTGATACAGG GGATTTAAAT 2 5 80 

TCGAGTGTAG ACCAGGGAAT CTGGGCGATC TTTAAAGTTC GAACAACCGA TGGTTATGCG 2 64 0 

ACGTTAGGAA ATCTTGAATT GGTAGAGGTC GGACCGTTAT CGGGTGAATC TTTAGAACGT 27 00 

GAACAAAGGG ATAATACAAA ATGGAGTGCA GAGCTAGGAA GAAAGCGTGC AGAAACAGAT 2760 

CGCGTGTATC AAGATGCCAA ACAATCCATC AATCATTTAT TTGTGGATTA TCAAGATCAA 2 82 0 

CAATTAAATC CAGAAATAGG GATGGCAGAT ATTATGGACG CTCAAAATCT TGTCGCATCA 2 880 

ATTTCAGATG TATATAGCGA TGCCGTACTG CAAATCCCTG GAATTAACTA TGAGATTTAC 2 94 0 

ACAGAGCTGT CCAATCGCTT ACAACAAGCA TCGTATCTGT ATACGTCTCG AAATGCGGTG 3000

CAAAATGGGG ACTTTAACAA CGGGCTAGAT AGCTGGAATG CAACAGCGGG TGCATCGGTA 3 060 

CAACAGGATG GCAATACGCA TTTCTTAGTT CTTTCTCATT GGGATGCACA AGTTTCTCAA 3120 

CAATTTAGAG TGCAGCCGAA TTGTAAATAT GTATTACGTG TAACAGCAGA GAAAGTAGGC 3180 

GGCGGAGACG GATACGTGAC TATCCGGGAT GGTGCTCATC ATACAGAAAC GCTTACATTT 324 0 

AATGCATGTG ATTATGATAT AAATGGCACG TACGTGACTG ATAATACGTA TCTAACAAAA 3 3 00 

GAAGTGATAT TCTATTCACA TACAGAACAC ATGTGGGTAG AGGTAAATGA AACAGAAGGT 3360 



GCATTTCATA TAGATAGTAT TGAATTCGTT GAAACAGAAA AGTAAGGTAC C 3411 



(2) INFORMATION FOR SEQ ID NO: 78: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 789 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:



Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe 
1 5 10 15

lie Asp Tyr Phe Asn Gly lie Tyr Gly Phe Ala Thr Gly lie Lys Asp 
20 25 30 

He Met Asn Met He Phe Lys Thr Asp Thr Gly Gly Asp Leu Thr Leu 
35 40 45

Asp Glu He Leu Lys Asn Gin Gin Leu Leu Asn Asp He Ser Gly Lys 
50 55 60 

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu He Ala Gin Gly Asn 
65 70 75 80 

Leu Asn Thr Glu Leu Ser Lys Glu lie Leu Lys He Ala Asn Glu Gin 
85 90 95 

Asn Gin Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala He Asn Thr 
100 105 110 

Met Leu Arg Val Tyr Leu Pro Lys He Thr Ser Met Leu Ser Asp Val 
115 120 125 

Met Lys Gin Asn Tyr Ala Leu Ser Leu Gin lie Glu Tyr Leu Ser Lys 
130 135 140 

Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val
145 150 155 160 

Leu lie Asn Ser Thr Leu Thr Glu He Thr Pro Ala Tyr Gin Arg lie 
165 170 175 



Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 
180 185 190 



Ser Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asp Ile Leu Asp Glu
195 200 205

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val
210 215 220 

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 
225 230 235 240 

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu He 
245 250 255 

Thr Lys Glu Asn Val Lys Ala Ser Gly Ser Glu Val Gly Asn Val Tyr 
260 265 270 

Asn Phe Leu He Val Leu Thr Ala Leu Gin Ala Lys Ala Phe Leu Thr 
275 280 285 



Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp He Asp Tyr Thr 
290 295 300 

Ser He Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 
305 310 315 320 

Asn He Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 
325 330 335 

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met He Val Glu Ala Lys 
340 345 350 

Pro Gly His Ala Leu lie Gly Phe Glu He Ser Asn Asp Ser He Thr 
355 360 365 

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gin Asn Tyr Gin Val Asp 
370 375 380 

Lys Asp Ser Leu Ser Glu Val He Tyr Gly Asp Met Asp Lys Leu Leu 
385 390 395 400 

Cys Pro Asp Gin Ser Glu Gin He Tyr Tyr Thr Asn Asn He Val Phe 
405 410 415 

Pro Asn Glu Tyr Val He Thr Lys He Asp Phe Thr Lys Lys Met Lys 
420 425 430

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 
435 440 445 

Glu lie Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr 
450 455 460 

Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 
465 470 475 480 



Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala
485 490 495

Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg
500 505 510 

Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu He 
515 520 525 

Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile
530 535 540 

Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr 
545 550 555 560

Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His
565 570 575 

Lys Asp Gly Gly He Ser Gin Phe He Gly Asp Lys Leu Lys Pro Lys 
580 585 590 

Thr Glu Tyr Val He Gin Tyr Thr Val Lys Gly Lys Pro Ser lie His 
595 600 605 

Leu Lys Asp Glu Asn Thr Gly Tyr He His Tyr Glu Asp Thr Asn Asn 
610 615 620 

Asn Leu Glu Asp Tyr Gin Thr He Asn Lys Arg Phe Thr Thr Gly Thr 
625 630 635 640 

Asp Leu Lys Gly Val Tyr Leu He Leu Lys Ser Gin Asn Gly Asp Glu 
645 650 655 

Ala Trp Gly Asp Asn Phe He He Leu Glu He Ser Pro Ser Glu Lys 
660 665 670 

Leu Leu Ser Pro Glu Leu He Asn Thr Asn Asn Trp Thr Ser Thr Gly 
675 680 685 

Ser Thr Asn lie Ser Gly Asn Thr Leu Thr Leu Tyr Gin Gly Gly Arg 
690 695 700 

Gly He Leu Lys Gin Asn Leu Gin Leu Asp Ser Phe Ser Thr Tyr Arg 
705 710 715 720

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg He Arg Asn Ser 
725 730 735 

Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val 
740 745 750 



Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr He Glu 
755 760 765 



Leu Ser Gin Gly Asn Asn Leu Tyr Gly Gly Pro He Val His Phe Tyr 
770 775 780 



Asp Val Ser Ile Lys
785 

(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2370 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 79: 

ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60 

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 12 0 

GATACAGGTG GTGATCTAAC CCTAGACGAA ATTTTAAAGA ATCAGCAGTT ACTAAATGAT 180 

ATTTCTGGTA AATTGGATGG GGTGAATGGA AGCTTAAATG ATCTTATCGC ACAGGGAAAC 24 0 

TTAAATACAG AATTATCTAA GGAAATATTA AAAATTGCAA ATGAACAAAA TCAAGTTTTA 300

AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCGGGTATA TCTACCTAAA 3 60 

ATTACCTCTA TGTTGAGTGA TGTAATGAAA CAAAATTATG CGCTAAGTCT GCAAATAGAA 420 

TACTTAAGTA AACAATTGCA AGAGATTTCT GATAAGTTGG ATATTATTAA TGTAAATGTA 4 80 

CTTATTAACT CTACACTTAC TGAAATTACA CCTGCGTATC AAAGGATTAA ATATGTGAAC 54 0 

GAAAAATTTG AGGAATTAAC TTTTGCTACA GAAACTAGTT CAAAAGTAAA AAAGGATGGC 600 

TCTCCTGCAG ATATTCTTGA TGAGTTAACT GAGTTAACTG AACTAGCGAA AAGTGTAACA 660 

AAAAATGATG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 72 0 

AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCATCGG AATTAATTAC TAAAGAAAAT 780 

GTGAAAGCAA GTGGCAGTGA GGTCGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 84 0 

CTGCAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900 

ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960 

AACATCCTCC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 102 0 

AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GACATGCATT GATTGGGTTT 1080 

GAAATTAGTA ATGATTCAAT TACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 114 0 

TATCAAGTCG ATAAGGATTC CTTATCGGAA GTTATTTATG GTGATATGGA TAAATTATTG 1200

TGCCCAGATC AATCTGAACA AATCTATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260

GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 132 0 

AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 13 80 

GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGGG TGTATATGCC GTTAGGTGTC 14 4 0 

ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCC TCCAAGCTGA TGAAAATTCA 150 0 

AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1560

AGCAATAAAG AAACTAAATT GATTGTCCCG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 162 0 

AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 1680 

GTAGATCATA CAGGCGGAGT GAATGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 1740

ATTTCACAAT TTATTGGAGA TAAGTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1800 

GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA 186 0 

GATACAAATA ATAATTTAGA AGATTATCAA ACTATTAATA AACGTTTTAC TACAGGAACT 1920 

GATTTAAAGG GAGTGTATTT AATTTTAAAA AGTCAAAATG GAGATGAAGC TTGGGGAGAT 1980 

AACTTTATTA TTTTGGAAAT TAGTCCTTCT GAAAAGTTAT TAAGTCCAGA ATTAATTAAT 2 04 0 

ACAAATAATT GGACGAGTAC GGGATCAACT AATATTAGCG GTAATACACT CACTCTTTAT 2100 

CAGGGAGGAC GAGGGATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTTC AACTTATAGA 216 0 

GTGTATTTTT CTGTGTCCGG AGATGCTAAT GTAAGGATTA GAAATTCTAG GGAAGTGTTA 2220 

TTTGAAAAAA GATATATGAG CGGTGCTAAA GATGTTTCTG AAATGTTCAC TACAAAATTT 2280 

GAGAAAGATA ACTTTTATAT AGAGCTTTCT CAAGGGAATA ATTTATATGG TGGTCCTATT 234 0 

GTACATTTTT ACGATGTCTC TATTAAGTAA 2370 

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 789 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 80: 

Met Asn Lys Asp Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe
1 5 10 15

Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp
20 25 30 

He Met Asn Met He Phe Lys Thr Asp Thr Gly Gly Asp Leu Thr Leu 
35 40 45 

Asp Glu He Leu Lys Asn Gin Gin Leu Leu Asn Asp He Ser Gly Lys 
50 55 60 

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn
65 70 75 80

Leu Asn Thr Glu Leu Ser Lys Glu He Leu Lys He Ala Asn Glu Gin 
85 90 95 

Asn Gin Val Leu Asn Glu Val Asn Asn Lys Leu Glu Ala He Ser Thr 
100 105 110 

He Phe Arg Val Tyr Leu Pro Lys Asn Thr Ser Arg Gly Gly Gly Val 
115 120 125 

Met Lys Gin Asn Tyr Ala Leu Ser Leu Gin Met Glu Asn Leu Ser Lys 
130 135 140 

Gin Leu Gin Glu He Ser Val Lys Trp Asp He He Asn Val Asn Val 
145 150 155 160 

Leu He Asn Ser Thr Leu Thr Glu He Thr Pro Ala Tyr Gin Arg He 
165 170 175 

Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 
180 185 190 

Ser Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asp He Leu Asp Glu 
195 200 205 

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 
210 215 220 

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 
225 230 235 240 

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu He 
245 250 255 

Thr Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr 
260 265 270 



Asn Phe Leu He Val Leu Thr Ala Leu Gin Ala Lys Ala Phe Leu Thr 
275 280 285



Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr
290 295 300

Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val
305 310 315 320 

Asn He Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 
325 330 335 

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met He Val Glu Ala Lys 
340 345 350 

Pro Gly His Ala Leu He Gly Phe Glu He Ser Asn Asp Ser He Thr 
355 360 365

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gin Asn Tyr Gin Val Asp 
370 375 380 

Lys Asp Ser Leu Ser Glu Val He Tyr Gly Asp Met Asp Lys Leu Leu 
385 390 395 400 

Cys Pro Asp Gin Ser Glu Gin He Tyr Tyr Thr Asn Asn He Val Phe 
405 410 415 

Pro Asn Glu Tyr Val He Thr Lys He Asp Phe Thr Lys Lys Met Lys 
420 425 430 

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly
435 440 445 

Glu lie Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr 
450 455 460 

Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 
465 470 475 480 

lie Ser Glu Thr Phe Leu Thr Pro He Asn Gly Phe Gly Leu Gin Ala 
485 490 495 

Asp Glu Asn Ser Arg Leu He Thr Leu Thr Cys Lys Ser Tyr Leu Arg 
500 505 510 

Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu lie 
515 520 525 

Val Pro Pro Ser Gly Phe He Ser Xaa He Val Glu Asn Gly Ser He 
530 535 540 

Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr 
545 550 555 560 



Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His 
565 570 575 



Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys
580 585 590

Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His
595 600 605 

Leu Lys Asp Glu Asn Thr Gly Tyr He His Tyr Glu Asp Thr Asn Asn 
610 615 620 

Asn Leu Glu Asp Tyr Gin Thr He Asn Lys Arg Phe Thr Thr Gly Thr 
625 630 635 640 

Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu
645 650 655

Ala Trp Gly Asp Asn Phe He lie Leu Glu He Ser Pro Ser Glu Lys 
660 665 670 

Leu Leu Ser Pro Glu Leu He Asn Thr Asn Asn Trp Thr Ser Thr Gly 
675 680 685

Ser Thr Asn He Ser Gly Asn Thr Leu Thr Leu Tyr Gin Gly Gly Arg 
690 695 700 

Gly He Leu Lys Gin Asn Leu Gin Leu Asp Ser Phe Ser Thr Tyr Arg 
705 710 715 720

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg He Arg Asn Ser 
725 730 735 

Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val 
740 745 750 

Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr lie Glu 
755 760 765 



Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr
770 775 780 



Asp Val Ser He Lys 
785 



(2) INFORMATION FOR SEQ ID NO: 81: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2375 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE : DNA (genomic) 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 81: 



ATGAACAAGG ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 120

GATACAGGTG GTGATCTAAC CCTAGACGAA ATTTTAAAGA ATCAGCAGTT ACTAAATGAT 180

ATTTCTGGTA AATTGGATGG GGTGAATGGA AGCTTAAATG ATCTTATCGC ACAGGGAAAC 24 0 

TTAAATACAG AATTATCTAA GGAAATATTA AAAATTGCAA ATGAACAAAA TCAAGTTTTA 3 00 

AATGAGGTTA ATAACAAACT CGAGGCGATA AGTACGATTT TTCGGGTATA TTTACCTAAA 360 

AATACCTCTA GGGGGGGGGG GGTAATGAAA CAAAATTATG CGCTAAGTCT GCAAATGGAA 42 0 

AACTTGAGTA AACAATTACA AGAGATTTCT GTTAAGTGGG ATATTATTAA TGTAAATGTA 4 80 

CTTATTAACT CTACACTTAC CGAAATTACA CCTGCGTATC AAAGGATTAA ATATGTGAAC 54 0 

GAAAAATTTG AGGAATTAAC TTTTGCTACA GAAACTAGTT CAAAAGTAAA AAAGGATGGC 600 

TCTCCCGCAG ATATTCTTGA TGAGTTAACT GAGTTAACTG AACTAGCGAA AAGTGTAACA 660 

AAAAATGATG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 72 0 

AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCATCGG AATTAATTAC TAAAGAAAAT 78 0 

GTGAAAACAA GTGGCAGTGA GGTCGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 84 0 

CTGCAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900 

ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960 

AACATCCTCC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 102 0 

AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GACATGCATT GATTGGGTTT 108 0 

GAAATTAGTA ATGATTCAAT TACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 114 0 

TATCAAGTCG ATAAGGATTC CTTATCGGAA GTTATTTATG GTGATATGGA TAAATTATTG 1200

TGCCCAGATC AATCTGAACA AATCTATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260 

GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 132 0 

AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380

GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGGG TGTATATGCC GTTAGGTGTC 144 0 

ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCC TCCAAGCTGA TGAAAATTCA 1500 

AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACCGACTTA 156 0 

AGCAATAAAG AAACTAAATT GATCGTCCCG CCAAGTGGTT TTATTAGCSA TATTGTAGAG 162 0 

AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 1680 

GTAGATCATA CAGGCGGAGT GAATGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 174 0 

ATTTCACAAT TTATTGGAGA TAAGTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1800

GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA 1860

GATACAAATA ATAATTTAGA AGATTATCAA ACTATTAATA AACGTTTTAC TACAGGAACT 1920

GATTTAAAGG GAGTGTATTT AATTTTAAAA AGTCAAAATG GAGATGAAGC TTGGGGAGAT 1980 

AACTTTATTA TTTTGGAAAT TAGTCCTTCT GAAAAGTTAT TAAGTCCAGA ATTAATTAAT 204 0 

ACAAATAATT GGACGAGTAC GGGATCAACT AATATTAGCG GTAATACACT CACTCTTTAT 2100 

CAGGGAGGAC GAGGGATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTTC AACTTATAGA 216 0 

GTGTATTTTT CTGTGTCCGG AGATGCTAAT GTAAGGATTA GAAATTCTAG GGAAGTGTTA 2220

TTTGAAAAAA GATATATGAG CGGTGCTAAA GATGTTTCTG AAATGTTCAC TACAAAATTT 22 80 

GAGAAAGATA ACTTTTATAT AGAGCTTTCT CAAGGGAATA ATTTATATGG TGGTCCTATT 2 34 0 

GTTCATTTTT ACGATGTCTC TATTAAGTAA CCCAA 2 375 

(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 789 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe 
1 5 10 15

He Asp Tyr Phe Asn Gly He Tyr Gly Phe Ala Thr Gly He Lys Asp 
20 25 30 

He Met Asn Met He Phe Lys Thr Asp Thr Gly Gly Asn Leu Thr Leu 
35 40 45 

Asp Glu He Leu Lys Asn Gin Gin Leu Leu Asn Glu He Ser Gly Lys 
50 55 60 

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu He Ala Gin Gly Asn 
65 70 75 80 

Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln
85 90 95

Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr
100 105 110

Met Leu His Ile Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val
115 120 125 

Met Lys Gin Asn Tyr Ala Leu Ser Leu Gin lie Glu Tyr Leu Ser Lys 
130 135 140 

Gin Leu Gin Glu lie Ser Asp Lys Leu Asp lie lie Asn Val Asn Val 
145 150 155 160 

Leu lie Asn Ser Thr Leu Thr Glu lie Thr Pro Ala Tyr Gin Arg lie 
165 170 175 

Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 
180 185 190

Thr Leu Lys Val Lys Lys Asp Ser Ser Pro Ala Asp He Leu Asp Glu 
195 200 205 

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val 
210 215 220 

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 
225 230 235 240 

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu He 
245 250 255 

Ala Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr 
260 265 270 



Asn Phe Leu He Val Leu Thr Ala Leu Gin Ala Lys Ala Phe Leu Thr 
275 280 285 

Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp He Asp Tyr Thr 
290 295 300 

Ser He Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 
305 310 315 320 

Asn He Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 
325 330 335 

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met He Val Glu Ala Lys 
340 345 350 

Pro Gly Tyr Ala Leu Val Gly Phe Glu Met Ser Asn Asp Ser He Thr 
355 360 365 

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gin Asn Tyr Gin Val Asp 
370 375 380 



Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Thr Asp Lys Leu Leu
385 390 395 400

Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe
405 410 415

Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys
420 425 430

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly
435 440 445



Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr
450 455 460

Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 
465 470 475 480 

He Ser Glu Thr Phe Leu Thr Pro He Asn Gly Phe Gly Leu Gin Ala 
485 490 495 



Asp Gly Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg
500 505 510

Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile
515 520 525

Val Leu Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile
530 535 540

Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr
545 550 555 560

Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His
565 570 575

Lys Asp Gly Gly Phe Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys
580 585 590

Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His
595 600 605

Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn
610 615 620

Asn Leu Lys Asp Tyr Gln Thr Ile Thr Lys Arg Phe Thr Thr Gly Thr
625 630 635 640

Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu
645 650 655

Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Ser Glu Lys
660 665 670

Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly
675 680 685

Ser Thr His Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg
690 695 700 

Gly He Leu Lys Gin Asn Leu Gin Leu Asp Ser Phe Ser Thr Tyr Arg 
705 710 715 720 

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg He Arg Asn Ser 
725 730 735 

Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val 
740 745 750

Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr He Glu 
755 760 765 

Leu Ser Gin Gly Asn Asn Leu Tyr Gly Gly Pro He Val His Phe Asn 
770 775 780 

Asp Val Ser Ile Lys
785



(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2375 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 
ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 6 0 

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAATATGAT TTTTAAAACG 12 0 

GATACAGGTG GTAATCTAAC CTTAGATGAA ATCCTAAAGA ATCAGCAGTT ACTAAATGAG 180 

ATTTCTGGTA AATTGGATGG GGTAAATGGG AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240

TTAAATACAG AATTATCTAA GGAAATCTTA AAAATTGCAA ATGAACAGAA TCAAGTCTTA 3 00 

AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCATATATA TCTACCTAAA 3 60 

ATTACATCTA TGTTAAGTGA TGTAATGAAG CAAAATTATG CGCTAAGTCT GCAAATAGAA 4 20 

TACTTAAGTA AACAATTGCA AGAAATTTCT GATAAATTAG ATATTATTAA CGTAAATGTT 4 80 

CTTATTAACT CTACACTTAC TGAAATTACA CCTGCATATC AACGGATTAA ATATGTGAAT 54 0 

GAAAAATTTG AAGAATTAAC TTTTGCTACA GAAACCACTT TAAAAGTAAA AAAGGATAGC 600 

TCGCCTGCTG ATATTCTTGA TGAGTTAACT GAATTAACTG AACTAGCGAA AAGTGTTACA 660

AAAAATGACG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720

AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCTTCAG AATTAATTGC TAAAGAAAAT 78 0 

GTGAAAACAA GTGGCAGTGA AGTAGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 84 0 

CTACAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900 

ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960 

AACATCCTTC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 102 0 

AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GATATGCATT GGTTGGGTTT 1080 

GAAATGAGCA ATGATTCAAT CACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 114 0 

TATCAAGTTG ATAAGGATTC CTTATCGGAA GTTATTTATG GTGATACGGA TAAATTATTG 12 00 

TGTCCAGATC AATCTGAACA AATATATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260 

GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 132 0 

AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380 

GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGAG TGTATATGCC ATTAGGTGTC 14 4 0 

ATCAGTGAAA CATTTTTGAC TCCGATAAAT GGGTTTGGCC TCCAAGCTGA TGGAAATTCA 15 00 

AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 156 0 

AGCAATAAAG AAACTAAATT GATCGTCCTG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 162 0 

AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 1680 

GTAGATCATA CAGGCGGAGT GAATGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 174 0 

TTTTCACAAT TTATTGGAGA TAAGTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 180 0 

GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA 1860

GATACAAATA ATAATTTAAA AGATTATCAA ACTATTACTA AACGTTTTAC TACAGGAACT 192 0 

GATTTAAAGG GAGTGTATTT AATTTTAAAA AGTCAAAATG GAGATGAAGC TTGGGGAGAT 198 0 

AACTTTATTA TTTTGGAAAT TAGTCCTTCT GAAAAGTTAT TAAGTCCAGA ATTAATTAAT 204 0 

ACAAATAATT GGACGAGTAC GGGATCAACT CATATTAGCG GTAATACACT CACTCTTTAT 2100 

CAGGGAGGAC GAGGAATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTTC AACTTATAGA 2160 

GTGTATTTTT CTGTGTCCGG AGATGCTAAT GTAAGGATTA GAAATTCTAG GGAAGTGTTA 222 0 

TTTGAAAAAA GATATATGAG CGGTGCTAAA GATGTTTCTG AAATGTTCAC TACAAAATTT 2280 

GAGAAAGATA ACTTTTATAT AGAGCTTTCT CAAGGGAATA ATTTATATGG TGGTCCTATT 2340

GTACATTTTA ACGATGTCTC TATTAAGTAA CCCAA 2375



(2) INFORMATION FOR SEQ ID NO: 84:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 789 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 



Met Asn Lys Asn Asn Thr Lys Leu Ser Ala Arg Ala Leu Pro Ser Phe 
1 5 10 15

He Asp Tyr Phe Asn Gly He Tyr Gly Phe Ala Thr Gly He Lys Asp 
20 25 30 

He Met Asn Met He Phe Lys Thr Asp Thr Gly Gly Asn Leu Thr Leu 
35 40 45 

Asp Glu He Leu Lys Asn Gin Gin Leu Leu Asn Glu lie Ser Gly Lys 
50 55 60 

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu He Ala Gin Gly Asn 
65 70 75 80 

Leu Asn Thr Glu Leu Ser Lys Glu He Leu Lys He Ala Asn Glu Gin 
85 90 95 

Asn Gin Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala He Asn Thr 
100 105 110

Met Leu His He Tyr Leu Pro Lys lie Thr Ser Met Leu Ser Asp Val 
115 120 125 

Met Lys Gin Asn Tyr Ala Leu Ser Leu Gin He Glu Tyr Leu Ser Lys 
130 135 140 

Gin Leu Gin Glu He Ser Asp Lys Leu Asp He He Asn Val Asn Val 
145 150 155 160

Leu He Asn Ser Thr Leu Thr Glu He Thr Pro Ala Tyr Gin Arg He 
165 170 175 



Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr 
180 185 190 



Ser Ser Lys Val Lys Lys Asp Ser Pro Pro Ala Asp Ile Leu Asp Glu
195 200 205

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val
210 215 220 

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly 
225 230 235 240 

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu lie 
245 250 255 

Ala Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr 

260 265 270

Asn Phe Leu He Val Leu Thr Ala Leu Gin Ala Lys Ala Phe Leu Thr 
275 280 285 

Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp He Asp Tyr Thr 
290 295 300 

Ser He Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val 
305 310 315 320 

Asn He Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala 
325 330 335 

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met He Val Glu Ala Lys 
340 345 350 

Pro Gly Tyr Ala Leu Val Gly Phe Glu Met Ser Asn Asp Ser He Thr 
355 360 365 

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gin Asn Tyr Gin Val Asp 
370 375 380 

Lys Asp Ser Leu Ser Glu Val He Tyr Gly Asp Thr Asp Lys Leu Leu 
385 390 395 400 

Cys Pro Asp Gin Ser Glu Gin lie Tyr Tyr Thr Asn Asn He Val Phe 
405 410 415 

Pro Asn Glu Tyr Val He Thr Lys He Asp Phe Thr Lys Lys Met Lys 
420 425 430 

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 
435 440 445 

Glu He Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr 
450 455 460 



Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val 
465 470 475 480 



Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala
485 490 495

Asp Gly Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg

500 505 510 

Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu He 

515 520 525 

Val Pro Pro Ser Gly Phe He Ser Asn He Val Glu Asn Gly Ser He 

530 535 540 

Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr 

545 550 555 560 

Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His 

565 570 575 

Lys Asp Gly Gly Phe Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys

580 585 590 

Thr Glu Tyr Val He Gin Tyr Thr Val Lys Gly Lys Pro Ser He His 

595 600 605 

Leu Lys Asp Glu Asn Thr Gly Tyr He His Tyr Glu Asp Thr Asn Asn 

610 615 620 

Asn Leu Lys Asp Tyr Gin Thr He Thr Lys Arg Phe Thr Thr Gly Thr 

625 630 635 640 

Asp Leu Lys Gly Val Tyr Leu He Leu Lys Ser Gin Asn Gly Asp Glu 

645 650 655 

Ala Trp Gly Asp Asn Phe He lie Leu Glu He Ser Pro Ser Glu Lys 

660 665 670 

Leu Leu Ser Pro Glu Leu He Asn Thr Asn Asn Trp Thr Ser Thr Gly 

675 680 685 

Ser Thr His lie Ser Gly Asn Thr Leu Thr Leu Tyr Gin Gly Gly Arg 

690 695 700 

Gly lie Leu Lys Gin Asn Leu Gin Leu Asp Ser Phe Ser Thr Tyr Arg 

705 710 715 720

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg He Arg Asn Ser 

725 730 735 

Arg Glu Val Leu Phe Glu Lys Gly Tyr Met Ser Gly Ala Lys Asp Val 

740 745 750 



Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr lie Glu 
755 760 765 



Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr
770 775 780

Asp Val Ser Ile Lys
785

(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2375 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

ATGAACAAGA ATAATACTAA ATTAAGCGCA AGGGCCCTAC CGAGTTTTAT TGATTATTTT 6 0 

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAATATGAT TTTTAAAACG 12 0 

GATACAGGTG GTAATCTAAC CTTAGATGAA ATCCTAAAGA ATCAGCAGTT ACTAAATGAG 180 

ATTTCTGGTA AATTGGATGG GGTAAATGGG AGCTTAAATG ATCTTATCGC ACAGGGAAAC 24 0 

TTAAATACAG AATTATCTAA GGAAATCTTA AAAATTGCAA ATGAACAGAA TCAAGTCTTA 3 00 

AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCATATATA TCTACCTAAA 360 

ATTACATCTA TGTTAAGTGA TGTAATGAAA CAAAATTATG CGCTAAGTCT GCAAATAGAA 42 0 

TACTTAAGTA AACAATTGCA AGAAATTTCT GATAAATTAG ATATTATTAA CGTAAATGTC 4 80 

CTTATTAACT CTACACTTAC TGAAATTACA CCTGCATATC AACGGATTAA ATATGTGAAT 54 0 

GAAAAATTTG AAGAATTAAC TTTTGCTACA GAAACTAGTT CAAAAGTAAA AAAGGATAGC 60 0 

CCCCCTGCTG ATATTCTTGA TGAGTTAACT GAATTAACTG AACTAGCGAA AAGTGTAACA 66 0 

AAAAATGACG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 72 0 

AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCTTCAG AATTAATTGC TAAAGAAAAT 780 

GTGAAAACAA GTGGCAGTGA AGTAGGAAAT GTTTATAATT TCTTAATTGT ATTAACAGCT 84 0 

CTACAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900 

ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 96 0 

AACATCCTTC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020 

AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GATATGCATT GGTTGGTTTT 1080 

GAAATGAGCA ATGATTCAAT CACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 114 0 

TATCAAGTTG ATAAGGATTC CTTATCGGAG GTTATTTATG GTGATACGGA TAAATTATTG 1200

TGTCCAGATC AATCTGAACA AATATATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260

GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320

AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 13 80 

GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGAG TGTATATGCC ATTAGGTGTC 144 0 

ATCAGTGAAA CATTTTTGAC TCCGATAAAT GGGTTTGGCC TCCAAGCTGA TGGAAATTCA 1500 

AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1560

AGCAATAAAG AAACTAAATT GATCGTCCCG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 162 0 

AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 1680 

GTAGATCATA CAGGCGGAGT GAATGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 1740

TTTTCACAAT TTATTGGAGA TAAGTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1800

GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA 1860

GATACAAATA ATAATTTAAA AGATTATCAA ACTATTACTA AACGTTTTAC TACAGGAACT 1920

GATTTAAAGG GAGTGTATTT AATTTTAAAA AGTCAAAATG GAGATGAAGC TTGGGGAGAT 1980

AACTTTATTA TTTTGGAAAT TAGTCCTTCT GAAAAGTTAT TAAGTCCAGA ATTAATTAAT 2040

ACAAATAATT GGACGAGTAC GGGATCAACT CATATTAGCG GTAATACACT CACTCTTTAT 2100

CAGGGAGGAC GAGGAATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTTC AACTTATAGA 2160

GTGTATTTTT CTGTGTCCGG AGATGCTAAT GTAAGGATTA GAAATTCTAG GGAAGTGTTA 2220

TTTGAAAAAG GATATATGAG CGGTGCTAAA GATGTTTCTG AAATGTTCAC TACAAAATTT 2280

GAGAAAGATA ACTTTTATAT AGAGCTTTCT CAAGGGAATA ATTTATATGG TGGTCCTATT 2340

GTACATTTTT ACGATGTCTC TATTAAGTAA CCAAG 2375

(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 759 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

Met Asn Lys Asn Asn Thr Lys Leu Ser Ala Arg Ala Leu Pro Ser Phe
1 5 10 15

Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp
20 25 30

Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asn Leu Thr Leu
35 40 45

Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Glu Ile Ser Gly Lys
50 55 60

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn
65 70 75 80

Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln
85 90 95

Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr
100 105 110

Met Leu Arg Ile Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val
115 120 125

Met Asn Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys
130 135 140

Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val
145 150 155 160

Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile
165 170 175

Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr
180 185 190

Xaa Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asp Ile Leu Asp Glu
195 200 205

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val
210 215 220

Asp Gly Phe Glu Ile Tyr Leu Asn Thr Phe His Asp Val Met Val Gly
225 230 235 240

Asn Asn Leu Ile Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile
245 250 255

Xaa Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr
260 265 270

Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr
275 280 285



Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr
290 295 300

Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val
305 310 315 320

Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala
325 330 335

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys
340 345 350

Pro Gly Tyr Ala Leu Val Gly Phe Glu Met Ser Asn Asp Ser Ile Thr
355 360 365

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp
370 375 380

Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Thr Asp Lys Leu Leu
385 390 395 400

Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe
405 410 415

Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys
420 425 430

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly
435 440 445

Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr
450 455 460

Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val
465 470 475 480

Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala
485 490 495

Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg
500 505 510

Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile
515 520 525

Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser His
530 535 540

Arg Arg Gly Gln Phe Arg Ala Val Glu Ser Lys Glu Cys Val Cys Arg
545 550 555 560

Ser Tyr Arg Arg Ser Glu Trp Asn Ser Phe Ile Cys Ser Gly Arg Arg
565 570 575



Asn Phe Thr Ile Tyr Trp Arg Val Lys Thr Glu Asn Val Cys Asn Pro
580 585 590

Ile Tyr Cys Arg Lys Thr Phe Tyr Ser Phe Lys Arg Lys Tyr Trp Ile
595 600 605

Tyr Ser Leu Arg Tyr Lys Phe Lys Arg Leu Ser Asn Tyr Tyr Thr Phe
610 615 620

Tyr Tyr Arg Asn Phe Lys Gly Ser Val Phe Asn Phe Lys Lys Ser Lys
625 630 635 640

Trp Arg Ser Leu Gly Arg Leu Tyr Tyr Phe Gly Asn Ser Phe Lys Val
645 650 655

Ile Lys Ser Arg Ile Asn Tyr Lys Leu Asp Glu Tyr Gly Ile Asn Ser
660 665 670

Tyr Arg Tyr Thr His Ser Leu Ser Gly Arg Thr Arg Asn Ser Lys Thr
675 680 685

Lys Pro Ser Ile Arg Phe Phe Asn Leu Ser Val Phe Phe Cys Val Arg
690 695 700

Arg Cys Cys Lys Asp Lys Phe Gly Ser Val Ile Lys Lys Ile Tyr Glu
705 710 715 720

Arg Cys Arg Cys Phe Asn Val His Tyr Lys Ile Glu Arg Leu Leu Tyr
725 730 735

Arg Ala Phe Ser Arg Glu Phe Ile Trp Trp Ser Tyr Cys Thr Phe Leu
740 745 750

Arg Cys Leu Tyr Val Thr Gln
755



(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2376 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

ATGAACAAGA ATAATACTAA ATTAAGCGCA AGAGCCCTAC CGAGTTTTAT TGATTATTTT 60 

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAATATGAT TTTTAAAACG 120

GATACAGGTG GTAATCTAAC CTTAGATGAA ATCCTAAAGA ATCAGCAGTT ACTAAATGAG 180

ATTTCTGGTA AATTGGATGG GGTAAATGGG AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240

TTAAATACAG AATTATCTAA GGAAATCTTA AAAATTGCAA ATGAACAAAA TCAAGTCTTA 300

AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCGGATATA TCTACCTAAA 360

ATTACATCTA TGTTAAGTGA TGTAATGAAC CAAAATTATG CGCTAAGTCT GCAAATAGAA 420

TACTTAAGTA AACAATTGCA AGAAATTTCT GATAAATTGG ATATTATTAA TGTAAATGTA 480

CTTATTAACT CTACACTTAC TGAAATTACA CCTGCGTATC AAAGGATTAA ATATGTGAAC 540

GAAAAATTTG AGGAATTAAC TTTTGCTACA GAAACTAKTT CAAAAGTAAA AAAGGATGGC 600

TCTCCTGCAG ATATTCTTGA TGAGTTAACT GAGTTAACTG AACTAGCGAA AAGTGTAACA 660

AAAAATGATG TGGATGGTTT TGAAATTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720

AATAATTTAA TCGGGCGTTC AGCTTTAAAA ACTGCATCGG AATTAATTAS TAAAGAAAAT 780 

GTGAAAACAA GTGGCAGTGA GGTAGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 840

CTACAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900

ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960

AACATCCTTC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020

AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GATATGCATT GGTTGGTTTT 1080

GAAATGAGCA ATGATTCAAT CACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140

TATCAAGTTG ATAAGGATTC CTTATCGGAG GTTATTTATG GTGATACGGA TAAATTATTG 1200

TGTCCAGATC AATCTGAACA AATATATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260

GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320

AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380

GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGAG TGTATATGCC GTTAGGTGTC 1440

ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCC TCCAAGCTGA TGAAAATTCA 1500 

AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1560 

AGCAATAAAG AAACTAAATT GATCGTCCCG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 1620 

AACGGGTCCC ATAGAAGAGG ACAATTTAGA GCCGTGGAAA GCAAATAATA AGAATGCGTA 1680 

TGTAGATCAT ACAGGCGGAG TGAATGGAAC TAAAGCTTTA TATGTTCATA AGGACGGAGG 1740

AATTTCACAA TTTATTGGAG ATAAGTTAAA ACCGAAAACT GAGTATGTAA TCCAATATAC 1800

TGTTAAAGGA AAACCTTCTA TTCATTTAAA AGATGAAAAT ACTGGATATA TTCATTATGA 1860

AGATACAAAT AATAATTTAA AAGATTATCA AACTATTACT AAACGTTTTA CTACAGGAAC 1920

TGATTTAAAG GGAGTGTATT TAATTTTAAA AAGTCAAAAT GGAGATGAAG CTTGGGGAGA 1980

TAACTTTATT ATTTTGGAAA TTAGTCCTTC TGAAAAGTTA TTAAGTCCAG AATTAATTAA 2040

TACAAATAAT TGGACGAGTA CGGGATCAAC TCATATTAGC GGTAATACAC TCACTCTTTA 2100

TCAGGGAGGA CGAGGAATTC TAAAACAAAA CCTTCAATTA GATAGTTTTT CAACTTATAG 2160

AGTGTATTTT TCTGTGTCCG GAGATGCTAA TGTAAGGATT AGAAATTCTA GGGAAGTGTT 2220

ATTTGAAAAA AGATATATGA GCGGTGCTAA AGATGTTTCT GAAATGTTCA CTACAAAATT 2280

TGAGAAAGAT AACTTTTATA TAGAGCTTTC TCAAGGGAAT AATTTATATG GTGGTCCTAT 2340

TGTACATTTT TACGATGTCT CTATTAAGTA ACCCAA 2376

(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 511 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

Tyr Leu Ser Lys Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile
1 5 10 15

Asn Val Asn Val Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala
20 25 30

Tyr Gln Arg Ile Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe
35 40 45

Ala Thr Glu Thr Thr Leu Lys Val Lys Lys Asp Ser Ser Pro Ala Asp
50 55 60

Ile Leu Asp Glu Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr
65 70 75 80

Lys Asn Asp Val Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp
85 90 95

Val Met Val Gly Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala
100 105 110

Ser Glu Leu Ile Ala Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val
115 120 125

Gly Asn Val Tyr Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys
130 135 140

Ala Phe Leu Thr Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp
145 150 155 160

Ile Asp Tyr Thr Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu
165 170 175

Glu Phe Arg Val Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn
180 185 190

Pro Asn Tyr Ala Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile
195 200 205

Val Glu Ala Lys Pro Gly Tyr Ala Leu Val Gly Phe Glu Met Ser Asn
210 215 220

Asp Ser Ile Thr Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn
225 230 235 240

Tyr Gln Val Asp Lys Asp Pro Leu Ser Glu Val Ile Tyr Gly Asp Thr
245 250 255

Asp Lys Leu Leu Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn
260 265 270

Asn Ile Val Phe Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr
275 280 285

Lys Lys Met Lys Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp
290 295 300

Ser Ser Thr Gly Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser
305 310 315 320

Glu Ala Glu Tyr Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met
325 330 335

Pro Leu Gly Val Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe
340 345 350

Gly Leu Gln Ala Asp Gly Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys
355 360 365

Ser Tyr Leu Arg Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu
370 375 380

Thr Lys Leu Ile Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu
385 390 395 400



Asn Gly Ser Ile Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn
405 410 415

Lys Asn Ala Tyr Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala
420 425 430

Leu Tyr Val His Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys
435 440 445

Leu Lys Pro Lys Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys
450 455 460

Pro Ser Ile His Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu
465 470 475 480

Asp Thr Asn Asn Asn Leu Lys Asp Tyr Gln Thr Ile Thr Lys Arg Phe
485 490 495

Thr Thr Gly Thr Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser
500 505 510

(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1533 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:

TACTTAAGTA AACAATTGCA AGAAATTTCT GATAAATTAG ATATTATTAA CGTAAATGTT 60 

CTTATTAACT CTACACTTAC TGAAATTACA CCTGCATATC AACGGATTAA ATATGTGAAT 120

GAAAAATTTG AAGAATTAAC TTTTGCTACA GAAACCACTT TAAAAGTAAA AAAGGATAGC 180

TCGCCTGCTG ATATTCTTGA TGAGTTAACT GAATTAACTG AACTAGCGAA AAGTGTTACA 240

AAAAATGACG TTGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 300

AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCTTCAG AATTAATTGC TAAAGAAAAT 360

GTGAAAACAA GTGGCAGTGA AGTAGGAAAT GTTTATAATT TCTTAATTGT ATTAACAGCT 420

CTACAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 480

ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 540

AACATCCTYC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 600 

AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GATATGCATT GGTTGGTTTT 660 

GAAATGAGCA ATGATTCAAT CACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 720

TATCAAGTTG ATAAGGATCC CTTATCGGAG GTTATTTATG GTGATACGGA TAAATTATTG 780

TGTCCAGATC AATCTGAACA AATATATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 840

GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 900

AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 960

GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGAG TGTATATGCC ATTAGGTGTC 1020

ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCC TCCAAGCTGA TGGAAATTCA 1080

AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1140

AGCAATAAAG AAACTAAATT GATCGTCCCG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 1200

AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 1260

GTAGATCATA CAGGCGGAGT GAATGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 1320

ATTTCACAAT TTATTGGAGA TAAGTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1380

GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA 1440

GATACAAATA ATAATTTAAA AGATTATCAA ACTATTACTA AACGTTTTAC TACAGGAACT 1500

GATTTAAAGG GAGTGTATTT AATTTTAAAA AGT 1533

(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 789 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe
1 5 10 15

Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp
20 25 30

Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asp Leu Thr Leu
35 40 45

Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Asp Ile Ser Gly Lys
50 55 60

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn
65 70 75 80

Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln
85 90 95

Asn Gln Val Leu Asn Asp Val Asp Asn Lys Leu Asp Ala Ile Asn Thr
100 105 110

Met Leu Arg Val Tyr Leu Pro Lys Ile Thr Xaa Met Leu Ser Asp Val
115 120 125

Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys
130 135 140

Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val
145 150 155 160

Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile
165 170 175

Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr
180 185 190

Ser Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asp Ile Leu Asp Glu
195 200 205

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val
210 215 220

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly
225 230 235 240

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile
245 250 255

Thr Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr
260 265 270

Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr
275 280 285

Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr
290 295 300

Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val
305 310 315 320

Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala
325 330 335

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys
340 345 350



Pro Gly His Ala Leu Val Gly Phe Glu Ile Ser Asn Asp Ser Ile Thr
355 360 365

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp
370 375 380

Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Met Asp Lys Leu Leu
385 390 395 400

Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe
405 410 415

Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys
420 425 430

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly
435 440 445

Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr
450 455 460

Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val
465 470 475 480

Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Pro Gln Ala
485 490 495

Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg
500 505 510

Lys Leu Leu Leu Ala Thr Asp Phe Ser Asn Lys Glu Thr Lys Leu Ile
515 520 525

Leu Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Xaa Asn Gly Ser Ile
530 535 540

Glu Glu Asp Asn Leu Glu Pro Gly Lys Ala Asn Asn Arg Asn Ala Tyr
545 550 555 560

Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His
565 570 575

Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys
580 585 590

Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His
595 600 605

Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn
610 615 620

Asn Leu Glu Asp Tyr Gln Thr Ile Thr Lys Arg Phe Thr Thr Gly Thr
625 630 635 640

Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu
645 650 655

Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Ser Glu Lys
660 665 670

Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly
675 680 685

Ser Thr Asn Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg
690 695 700

Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg
705 710 715 720

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser
725 730 735

Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val
740 745 750

Ser Glu Ile Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu
755 760 765

Leu Ser Gln Gly Asn Asn Leu Asn Gly Gly Pro Ile Val His Phe Tyr
770 775 780

Asp Val Ser Ile Lys
785



(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2367 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 



ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 120

GATACAGGTG GTGATCTAAC CCTAGACGAA ATTTTAAAGA ATCAGCAGTT ACTAAATGAT 180

ATTTCTGGTA AATTGGATGG GGTGAATGGA AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240

TTAAATACAG AATTATCTAA AGAAATATTA AAAATTGCAA ATGAACAAAA TCAAGTTTTA 300

AATGATGTTG ATAACAAACT CGATGCGATA AATACGATGC TTCGGGTATA TCTACCTAAA 360

ATTACCCTAT GTTGAGTGAT GTAATGAAAC AAAATTATGC GCTAAGTCTG CAAATAGAAT 420

ACTTAAGTAA ACAATTGCAA GAGATTTCTG ATAAGTTGGA TATTATTAAT GTAAATGTAC 480

TTATTAACTC TACACTTACT GAAATTACAC CTGCGTATCA AAGGATTAAA TATGTGAACG 540

AAAAATTTGA GGAATTAACT TTTGCTACAG AAACTAGTTC AAAAGTAAAA AAGGATGGCT 600

CTCCTGCAGA TATTCTTGAT GAGTTAACTG AGTTAACTGA ACTAGCGAAA AGTGTAACAA 660 

AAAATGATGT GGATGGTTTT GAATTTTACC TTAATACATT CCACGATGTA ATGGTAGGAA 720

ATAATTTATT CGGGCGTTCA GCTTTAAAAA CTGCATCGGA ATTAATTACT AAAGAAAATG 780

TGAAAACAAG TGGCAGTGAG GTCGGAAATG TTTATAACTT CTTAATTGTA TTAACAGCTC 840

TGCAAGCAAA AGCTTTTCTT ACTTTAACAA CATGCCGAAA ATTATTAGGC TTAGCAGATA 900

TTGATTATAC TTCTATTATG AATGAACATT TAAATAAGGA AAAAGAGGAA TTTAGAGTAA 960

ACATCCTCCC TACACTTTCT AATACTTTTT CTAATCCTAA TTATGCAAAA GTTAAAGGAA 1020

GTGATGAAGA TGCAAAGATG ATTGTGGAAG CTAAACCAGG ACATGCATTG GTTGGGTTTG 1080

AAATTAGTAA TGATTCAATT ACAGTATTAA AAGTATATGA GGCTAAGCTA AAACAAAATT 1140

ATCAAGTTGA TAAGGATTCC TTATCGGAAG TTATTTATGG TGATATGGAT AAATTATTGT 1200

GCCCAGATCA ATCTGAACAA ATCTATTATA CAAATAACAT AGTATTTCCA AATGAATATG 1260

TAATTACTAA AATTGATTTT ACTAAAAAAA TGAAAACTTT AAGATATGAG GTAACAGCGA 1320

ATTTTTATGA TTCTTCTACA GGAGAAATTG ACTTAAATAA GAAAAAAGTA GAATCAAGTG 1380

AAGCGGAGTA TAGAACGTTA AGTGCTAATG ATGATGGAGT GTATATGCCG TTAGGTGTCA 1440

TCAGTGAAAC ATTTTTGACT CCGATTAATG GGTTTGGCCC CCAAGCTGAT GAAAATTCAA 1500 

GATTAATTAC TTTAACATGT AAATCATATT TAAGAAAACT ACTGCTAGCA ACAGACTTTA 1560

GCAATAAAGA AACTAAATTG ATCCTCCCGC CAAGTGGTTT TATTAGCAAT ATTGTAGAAA 1620

CGGGTCCATA GAAGAGGACA ATTTAGAGCC GGGGAAAGCA AATAATAGGA ATGCGTATGT 1680

AGATCATACA GGCGGAGTGA ATGGAACTAA AGCTTTATAT GTTCATAAGG ACGGAGGAAT 1740

TTCACAATTT ATTGGAGATA AGTTAAAACC GAAAACTGAG TATGTAATCC AATATACTGT 1800

TAAAGGAAAA CCTTCTATTC ATTTAAAAGA TGAAAATACT GGATATATTC ATTATGAAGA 1860

TACAAATAAT AATTTAGAAG ATTATCAAAC TATTACTAAA CGTTTTACTA CAGGAACTGA 1920

TTTAAAGGGA GTGTATTTAA TTTTAAAAAG TCAAAATGGA GATGAAGCTT GGGGAGATAA 1980

CTTTATTATT TTGGAAATTA GTCCTTCTGA AAAGTTATTA AGTCCAGAAT TAATTAATAC 2040

AAATAATTGG ACGAGTACGG GATCAACTAA TATTAGCGGT AATACACTCA CTCTTTATCA 2100

GGGAGGACGA GGAATTCTAA AACAAAACCT TCAATTAGAT AGTTTTTCAA CTTATAGAGT 2160

GTATTTTTCT GTGTCCGGAG ATGCTAATGT AAGGATTAGA AATTCTAGGG AAGTGTTATT 2220

TGAAAAAAGA TATATGAGCG GTGCTAAAGA TGTTTCTGAA ATTTTCACTA CAAAATTTGA 2280

GAAAGATAAC TTTTATATAG AGCTTTCTCA AGGGAATAAT TTAAATGGTG GCCCTATTGT 2340

ACATTTTTAC GATGTCTCTA TTAAGTA 2367



(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 789 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:

Met Asn Lys Asn Asn Thr Lys Leu Ser Ala Arg Ala Leu Pro Ser Phe
1 5 10 15

Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp
20 25 30

Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asn Leu Thr Leu
35 40 45

Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Glu Ile Ser Gly Lys
50 55 60

Leu Gly Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn
65 70 75 80

Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln
85 90 95

Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr
100 105 110

Met Leu His Ile Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val
115 120 125

Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys
130 135 140

Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val
145 150 155 160

Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile
165 170 175

Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr
180 185 190

Thr Leu Lys Val Lys Lys Asp Ser Ser Pro Ala Asp Ile Leu Asp Glu
195 200 205

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val
210 215 220

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly
225 230 235 240

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile
245 250 255

Ala Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr
260 265 270

Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr
275 280 285

Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr
290 295 300

Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val
305 310 315 320

Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala
325 330 335

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys
340 345 350

Pro Gly Tyr Ala Leu Val Gly Phe Glu Met Ser Asn Asp Ser Ile Thr
355 360 365

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp
370 375 380

Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Thr Asp Lys Leu Leu
385 390 395 400

Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe
405 410 415

Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys
420 425 430

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly
435 440 445



Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr
450 455 460

Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val
465 470 475 480

Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala
485 490 495

Asp Gly Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg
500 505 510

Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile
515 520 525

Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile
530 535 540

Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr
545 550 555 560

Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His
565 570 575

Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys
580 585 590

Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His
595 600 605

Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn
610 615 620

Asn Leu Lys Asp Tyr Gln Thr Ile Thr Lys Arg Phe Thr Thr Gly Thr
625 630 635 640

Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu
645 650 655

Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Ser Glu Lys
660 665 670

Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly
675 680 685

Ser Thr His Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg
690 695 700

Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg
705 710 715 720

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser
725 730 735

Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val
740 745 750

Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu
755 760 765

Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr
770 775 780

Asp Val Ser Ile Lys
785

(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2369 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:

ATGAACAAGA ATAATACTAA ATTAAGCGCA AGGGCCCTAC CGAGTTTTAT TGATTATTTT 60 

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAATATGAT TTTTAAAACG 120 

GATACAGGTG GTAATCTAAC CTTAGATGAA ATCCTAAAGA ATCAGCAGTT ACTAAATGAG 180 

ATTTCTGGTA AATTGGGGGG GGTAAATGGG AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240

TTAAATACAG AATTATCTAA GGAAATCTTA AAAATTGCAA ATGAACAAAT CAAGTCTTAA 300

ATGATGTTAA TAACAAACTC GATGCGATAA ATACGATGCT TCATATATAT CTACCTAAAA 360

TTACATCTAT GTTAAGTGAT GTAATGAAGC AAAATTATGC GCTAAGTCTG CAAATAGAAT 420

ACTTAAGTAA ACAATTGCAA GAAATTTCTG ATAAATTAGA TATTATTAAC GTAAATGTTC 480

TTATTAACTC TACACTTACT GAAATTACAC CTGCATATCA ACGGATTAAA TATGTGAATG 540

AAAAATTTGA AGAATTAACT TTTGCTACAG AAACCACTTT AAAAGTAAAA AAGGATAGCT 600

CGCCTGCTGA TATTCTTGAT GAGTTAACTG AATTAACTGA ACTAGCGAAA AGTGTTACAA 660

AAAATGACGT TGATGGTTTT GAATTTTACC TTAATACATT CCACGATGTA ATGGTAGGAA 720

ATAATTTATT CGGGCGTTCA GCTTTAAAAA CTGCTTCAGA ATTAATTGCT AAAGAAAATG 780

TGAAAACAAG TGGCAGTGAA GTAGGAAATG TTTATAATTT CTTAATTGTA TTAACAGCTC 840

TACAAGCAAA AGCTTTTCTT ACTTTAACAA CATGCCGAAA ATTATTAGGC TTAGCAGATA 900 

TTGATTATAC TTCTATTATG AATGAACATT TAAATAAGGA AAAAGAGGAA TTTAGAGTAA 960 

ACATCCTTCC TACACTTTCT AATACTTTTT CTAATCCTAA TTATGCAAAA GTTAAAGGAA 1020

GTGATGAAGA TGCAAAGATG ATTGTGGAAG CTAAACCAGG ATATGCATTG GTTGGTTTTG 1080

AAATGAGCAA TGATTCAATC ACAGTATTAA AAGTATATGA GGCTAAGCTA AAACAAAATT 1140

ATCAAGTTGA TAAGGATTCC TTATCGGAGG TTATTTATGG TGATACGGAT AAATTATTGT 1200

GTCCAGATCA ATCTGAACAA ATATATTATA CAAATAACAT AGTATTTCCA AATGAATATG 1260

TAATTACTAA AATTGATTTC ACTAAAAAAA TGAAAACTTT AAGATATGAG GTAACAGCGA 1320

ATTTTTATGA TTCTTCTACA GGAGAAATTG ACTTAAATAA GAAAAAAGTA GAATCAAGTG 1380

AAGCGGAGTA TAGAACGTTA AGTGCTAATG ATGATGGAGT GTATATGCCA TTAGGTGTCA 1440

TCAGTGAAAC ATTTTTGACT CCGATAAATG GGTTTGGCCT CCAAGCTGAT GGAAATTCAA 1500

GATTAATTAC TTTAACATGT AAATCATATT TAAGAGAACT ACTGCTAGCA ACAGACTTAA 1560

GCAATAAAGA AACTAAATTG ATTGTCCCGC CAAGTGGTTT TATTAGCAAT ATTGTAGAGA 1620

ACGGGTCCAT AGAAGAGGAC AATTTAGAGC CGTGGAAAGC AAATAATAAG AATGCGTATG 1680

TAGATCATAC AGGCGGAGTG AATGGAACTA AAGCTTTATA TGTTCATAAG GACGGAGGAA 1740

TTTCACAATT TATTGGAGAT AAGTTAAAAC CGAAAACTGA GTATGTAATC CAATATACTG 1800

TTAAAGGAAA ACCTTCTATT CATTTAAAAG ATGAAAATAC TGGATATATT CATTATGAAG 1860

ATACAAATAA TAATTTAAAA GATTATCAAA CTATTACTAA ACGTTTTACT ACAGGAACTG 1920

ATTTAAAGGG AGTGTATTTA ATTTTAAAAA GTCAAAATGG AGATGAAGCT TGGGGAGATA 1980

ACTTTATTAT TTTGGAAATT AGTCCTTCTG AAAAGTTATT AAGTCCAGAA TTAATTAATA 2040

CAAATAATTG GACGAGTACG GGATCAACTC ATATTAGCGG TAATACACTC ACTCTTTATC 2100

AGGGAGGACG AGGAATTCTA AAACAAAACC TTCAATTAGA TAGTTTTTCA ACTTATAGAG 2160

TGTATTTTTC TGTGTCCGGA GATGCTAATG TAAGGATTAG AAATTCTAGG GAAGTGTTAT 2220

TTGAAAAAAG ATATATGAGC GGTGCTAAAG ATGTTTCTGA AATGTTCACT ACAAAATTTG 2280

AGAAAGATAA CTTTTATATA GAGCTTTCTC AAGGGAATAA TTTATATGGT GGTCCTATTG 2340

TACATTTTTA CGATGTCTCT ATTAAGTAA 2369



(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 789 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe
1 5 10 15

Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp
20 25 30

Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asp Leu Thr Leu
35 40 45

Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Asp Ile Ser Gly Lys
50 55 60

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn
65 70 75 80

Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln
85 90 95

Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr
100 105 110

Met Leu Arg Val Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val
115 120 125

Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys
130 135 140

Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val
145 150 155 160

Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile
165 170 175

Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr
180 185 190

Ser Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asp Ile Leu Asp Glu
195 200 205

Leu Ala Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val
210 215 220

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly
225 230 235 240

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile
245 250 255

Thr Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr
260 265 270



Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr
275 280 285

Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr
290 295 300

Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val
305 310 315 320

Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala
325 330 335

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys
340 345 350

Pro Gly His Ala Leu Ile Gly Phe Glu Ile Ser Asn Asp Ser Ile Thr
355 360 365

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp
370 375 380

Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Met Asp Lys Leu Leu
385 390 395 400

Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe
405 410 415

Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys
420 425 430

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly
435 440 445

Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr
450 455 460

Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val
465 470 475 480

Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala
485 490 495

Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg
500 505 510

Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile
515 520 525

Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile
530 535 540

Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr
545 550 555 560

Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His
565 570 575

Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys
580 585 590

Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His
595 600 605

Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn
610 615 620

Asn Leu Glu Asp Tyr Gln Thr Ile Asn Lys Arg Phe Thr Thr Gly Thr
625 630 635 640

Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu
645 650 655

Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Ser Glu Lys
660 665 670

Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly
675 680 685

Ser Thr Asn Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg
690 695 700

Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg
705 710 715 720

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser
725 730 735

Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val
740 745 750

Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu
755 760 765

Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr
770 775 780

Asp Val Ser Ile Lys
785



(2) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2370 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:
TTGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 120

GATACAGGTG GTGATCTAAC CCTAGACGAA ATTTTAAAGA ATCAGCAGTT ACTAAATGAT 180

ATTTCTGGTA AATTGGATGG GGTGAATGGA AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240

TTAAATACAG AATTATCTAA GGAAATATTA AAAATTGCAA ATGAACAAAA TCAAGTTTTA 300

AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCGGGTATA TCTACCTAAA 360

ATTACCTCTA TGTTGAGTGA TGTAATGAAA CAAAATTATG CGCTAAGTCT GCAAATAGAA 420

TACTTAAGTA AACAATTGCA AGAGATTTCT GATAAGTTGG ATATTATTAA TGTAAATGTA 480

CTTATTAACT CTACACTTAC TGAAATTACA CCTGCGTATC AAAGGATTAA ATATGTGAAC 540

GAAAAATTTG AGGAATTAAC TTTTGCTACA GAAACTAGTT CAAAAGTAAA AAAGGATGGC 600

TCTCCTGCAG ATATTCTTGA TGAGTTAGCT GAGTTAACTG AACTAGCGAA AAGTGTAACA 660

AAAAATGATG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720

AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCATCGG AATTAATTAC TAAAGAAAAT 780

GTGAAAACAA GTGGCAGTGA GGTCGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 840

CTGCAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900
ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960 

AACATCCTCC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020 

AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GACATGCATT GATTGGGTTT 1080 

GAAATTAGTA ATGATTCAAT TACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140

TATCAAGTCG ATAAGGATTC CTTATCGGAA GTTATTTATG GTGATATGGA TAAATTATTG 1200

TGCCCAGATC AATCTGAACA AATCTATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260

GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320

AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380

GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGGG TGTATATGCC GTTAGGTGTC 1440 

ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCC TCCAAGCTGA TGAAAATTCA 1500 

AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1560 

AGCAATAAAG AAACTAAATT GATTGTCCCG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 1620 

AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 1680 
GTAGATCATA CAGGCGGAGT GAATGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 1740

ATTTCACAAT TTATTGGAGA TAAGTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1800 

GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA 1860 

GATACAAATA ATAATTTAGA AGATTATCAA ACTATTAATA AACGTTTTAC TACAGGAACT 1920

GATTTAAAGG GAGTGTATTT AATTTTAAAA AGTCAAAATG GAGATGAAGC TTGGGGAGAT 1980

AACTTTATTA TTTTGGAAAT TAGTCCTTCT GAAAAGTTAT TAAGTCCAGA ATTAATTAAT 2040

ACAAATAATT GGACGAGTAC GGGATCAACT AATATTAGCG GTAATACACT CACTCTTTAT 2100 

CAGGGAGGAC GAGGGATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTTC AACTTATAGA 2160 

GTGTATTTTT CTGTGTCCGG AGATGCTAAT GTAAGGATTA GAAATTCTAG GGAAGTGTTA 2220

TTTGAAAAAA GATATATGAG CGGTGCTAAA GATGTTTCTG AAATGTTCAC TACAAAATTT 2280

GAGAAAGATA ACTTTTATAT AGAGCTTTCT CAAGGGAATA ATTTATATGG TGGTCCTATT 2340

GTACATTTTT ACGATGTCTC TATTAAGTAA 2370

(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 789 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe
1 5 10 15

Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp
20 25 30

Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asp Leu Thr Leu
35 40 45

Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Asp Ile Ser Gly Lys
50 55 60

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn
65 70 75 80

Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln
85 90 95

Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr
100 105 110

Met Leu Arg Val Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val
115 120 125

Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys
130 135 140

Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val
145 150 155 160

Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile
165 170 175

Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr
180 185 190

Ser Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asp Ile Leu Asp Glu
195 200 205

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val
210 215 220

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly
225 230 235 240

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile
245 250 255

Thr Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr
260 265 270

Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr
275 280 285

Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr
290 295 300

Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val
305 310 315 320

Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala
325 330 335

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys
340 345 350

Pro Gly His Ala Leu Ile Gly Phe Glu Ile Ser Asn Asp Ser Ile Thr
355 360 365

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp
370 375 380
Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Met Asp Lys Leu Leu
385 390 395 400

Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe
405 410 415

Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys
420 425 430

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly
435 440 445

Glu Ile Asp Leu Asn Lys Lys Asn Val Glu Ser Ser Glu Ala Glu Tyr
450 455 460

Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val
465 470 475 480

Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala
485 490 495

Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg
500 505 510

Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile
515 520 525

Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile
530 535 540

Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr
545 550 555 560

Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His
565 570 575

Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys
580 585 590

Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His
595 600 605

Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn
610 615 620

Asn Leu Glu Asp Tyr Gln Thr Ile Asn Lys Arg Phe Thr Thr Gly Thr
625 630 635 640

Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu
645 650 655

Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Ser Glu Lys
660 665 670
Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly
675 680 685

Ser Thr Asn Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg
690 695 700

Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg
705 710 715 720

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser
725 730 735

Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val
740 745 750

Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu
755 760 765

Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr
770 775 780

Asp Val Ser Ile Lys
785

(2) INFORMATION FOR SEQ ID NO: 97:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2374 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60 

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 120 

GATACAGGTG GTGATCTAAC CCTAGACGAA ATTTTAAAGA ATCAGCAGTT ACTAAATGAT 180 

ATTTCTGGTA AATTGGATGG GGTGAATGGA AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240

TTAAATACAG AATTATCTAA GGAAATATTA AAAATTGCAA ATGAACAAAA TCAAGTTTTA 300

AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCGGGTATA TCTACCTAAA 360

ATTACCTCTA TGTTGAGTGA TGTAATGAAA CAAAATTATG CGCTAAGTCT GCAAATAGAA 420

TACTTAAGTA AACAATTGCA AGAGATTTCT GATAAGTTGG ATATTATTAA TGTAAATGTA 480

CTTATTAACT CTACACTTAC TGAAATTACA CCTGCGTATC AAAGGATTAA ATATGTGAAC 540

GAAAAATTTG AGGAATTAAC TTTTGCTACA GAAACTAGTT CAAAAGTAAA AAAGGATGGC 600

TCTCCTGCAG ATATTCTTGA TGAGTTAACT GAGTTAACTG AACTAGCGAA AAGTGTAACA 660

AAAAATGATG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720

AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCATCGG AATTAATTAC TAAAGAAAAT 780

GTGAAAACAA GTGGCAGTGA GGTCGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 840

CTGCAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900

ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960

AACATCCTCC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020

AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GACATGCATT GATTGGGTTT 1080

GAAATTAGTA ATGATTCAAT TACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140

TATCAAGTCG ATAAGGATTC CTTATCGGAA GTTATTTATG GTGATATGGA TAAATTATTG 1200

TGCCCAGATC AATCTGAACA AATCTATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260

GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320

AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAACGT CGAATCAAGT 1380

GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGGG TGTATATGCC GTTAGGTGTC 1440

ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCC TCCAAGCTGA TGAAAATTCA 1500

AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1560

AGCAATAAAG AAACTAAATT GATGTCCCGC CAAGTGGTTT TATTAGCAAT ATTGTAGAGA 1620

ACGGGTCCAT AGAAGAGGAC AATTTAGAGC CGTGGAAAGC AAATAATAAG AATGCGTATG 1680

TAGATCATAC AGGCGGAGTG AATGGAACTA AAGCTTTATA TGTTCATAAG GACGGAGGAA 1740

TTTCACAATT TATTGGAGAT AAGTTAAAAC CGAAAACTGA GTATGTAATC CAATATACTG 1800 

TTAAAGGAAA ACCTTCTATT CATTTAAAAG ATGAAAATAC TGGATATATT CATTATGAAG 1860 

ATACAAATAA TAATTTAGAA GATTATCAAA CTATTAATAA ACGTTTTACT ACAGGAACTG 1920

ATTTAAAGGG AGTGTATTTA ATTTTAAAAA GTCAAAATGG AGATGAAGCT TGGGGAGATA 1980

ACTTTATTAT TTTGGAAATT AGTCCTTCTG AAAAGTTATT AAGTCCAGAA TTAATTAATA 2040

CAAATAATTG GACGAGTACG GGATCAACTA ATATTAGCGG TAATACACTC ACTCTTTATC 2100

AGGGAGGACG AGGGATTCTA AAACAAAACC TTCAATTAGA TAGTTTTTCA ACTTATAGAG 2160

TGTATTTTTC TGTGTCCGGA GATGCTAATG TAAGGATTAG AAATTCTAGG GAAGTGTTAT 2220

TTGAAAAAAG ATATATGAGC GGTGCTAAAG ATGTTTCTGA AATGTTCACT ACAAAATTTG 2280

AGAAAGATAA CTTTTATATA GAGCTTTCTC AAGGGAATAA TTTATATGGT GGTCCTATTG 2340

TACATTTTTA CGATGTCTCT ATTAAGTAAC CCAA 2374



(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 789 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 

Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe
1 5 10 15

Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp
20 25 30

Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asn Leu Thr Leu
35 40 45

Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Glu Ile Ser Gly Lys
50 55 60

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn
65 70 75 80

Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln
85 90 95

Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr
100 105 110

Met Leu His Ile Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val
115 120 125

Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys
130 135 140

Gln Leu Xaa Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val
145 150 155 160

Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile
165 170 175

Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr
180 185 190




Thr Leu Lys Val Lys Lys Asp Ser Ser Pro Ala Asp Ile Leu Asp Glu
195 200 205

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val
210 215 220

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly
225 230 235 240

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile
245 250 255

Ala Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr
260 265 270

Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr
275 280 285

Leu Thr Thr Cys Xaa Lys Leu Leu Gly Leu Ala Asn Ile Asp Tyr Thr
290 295 300

Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val
305 310 315 320

Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala
325 330 335

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys
340 345 350

Pro Gly Tyr Ala Leu Val Gly Phe Glu Met Ser Asn Asp Ser Ile Thr
355 360 365

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp
370 375 380

Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Thr Asp Lys Leu Leu
385 390 395 400

Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe
405 410 415

Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys
420 425 430

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly
435 440 445

Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr
450 455 460

Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val
465 470 475 480
Ile Ser Glu Thr Phe Leu Thr Xaa Ile Xaa Gly Phe Gly Leu Gln Ala
485 490 495

Asp Gly Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg
500 505 510

Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile
515 520 525

Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile
530 535 540

Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr
545 550 555 560

Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His
565 570 575

Lys Asp Gly Gly Phe Ser Gln Phe Ile Gly Asp Xaa Leu Lys Pro Lys
580 585 590

Thr Glu Tyr Xaa Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His
595 600 605

Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn
610 615 620

Asn Leu Lys Asp Tyr Gln Thr Ile Thr Lys Arg Phe Thr Thr Gly Thr
625 630 635 640

Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu
645 650 655

Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Ser Glu Lys
660 665 670

Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly
675 680 685

Ser Thr His Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg
690 695 700

Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg
705 710 715 720

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser
725 730 735

Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val
740 745 750

Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu
755 760 765

Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr
770 775 780

Asp Val Ser Ile Lys
785

(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2366 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:

ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CGAGTTTTAT TGATTATTTT 60

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAATATGAT TTTTAAAACG 120

GATACAGGTG GTAATCTAAC CTTAGATGAA ATCCTAAAGA ATCAGCAGTT ACTAAATGAG 180

ATTTCTGGTA AATTGGATGG GGTAAATGGG AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240

TTAAATACAG AATTATCTAA GGAAATCTTA AAAATTGCAA ATGAACAGAA TCAAGTCTTA 300

AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCATATATA TCTACCTAAA 360

ATTACATCTA TGTTAAGTGA TGTAATGAAG CAAAATTATG CGCTAAGTCT GCAAATAGAA 420

TACTTAAGTA AACAATTGCA GAATTTCTGA TAAATTAGAT ATTATTAACG TAAATGTTCT 480

TATTAACTCT ACACTTACTG AAATTACACC TGCATATCAA CGGATTAAAT ATGTGAAGAA 540

AAATTTGAAG AATTAACTTT TGCTACAGAA ACCACTTTAA AAGTAAAAAA GGATAGCTCG 600

CCTGCTGATA TTCTTGATGA GTTAACTGAA TTAACTGAAC TAGCGAAAAG TGTTACAAAA 660

AATGACGTTG ATGGTTTTGA ATTTTACCTT AATACATTCC ACGATGTAAT GGTAGGAAAT 720

AATTTATTCG GGCGTTCAGC TTTAAAAACT GCTTCAGAAT TAATTGCTAA AGAAAATGTG 780

AAAACAAGTG GCAGTGAAGT AGGAAATGTT TATAATTTCT TAATTGTATT AACAGCTCTA 840

CAAGCAAAAG CTTTTCTTAC TTTAACAACA TGCCAAAATT ATTAGGCTTA GCAAATATTG 900 

ATTATACTTC TATTATGAAT GAACATTTAA ATAAGGAAAA AGAGGAATTT AGAGTAAACA 960 

TCCTTCCTAC ACTTTCTAAT ACTTTTTCTA ATCCTAATTA TGCAAAAGTT AAAGGAAGTG 1020 

ATGAAGATGC AAAGATGATT GTGGAAGCTA AACCAGGATA TGCATTGGTT GGTTTTGAAA 1080 
TGAGCAATGA TTCAATCACA GTATTAAAAG TATATGAGGC TAAGCTAAAA CAAAATTATC 1140

AAnTTGATAA GGATTCCTTA TCGGAGGTTA TTTATGGTGA TACGGATAAA TTATTGTGTC 1200

[illegible] TGAACAAATA TATTATACAA ATAACATAGT ATTTCCAAAT GAATATGTAA 1260

TTDTTAAAAT TGATTTCACT AAAAAAATGA AAACTTTAAG ATATGAGGTA ACAGCGAATT 1320

TTTnTOATTP TTCTACAGGA GAAATTGACT TAAATAAGAA AAAAGTAGAA TCAAGTGAAG 1380

[illegible] AACGTTAAGT GCTAATGATG ATGGAGTGTA TATGCCATTA GGTGTCATCA 1440

[illegible] TTTGAPTPGA TTATGGGTTT GGCCTCCAAG CTGATGGAAA TTCAAGATTA 1500

[illegible] PATGTAAATC ATATTTAAGA GAACTACTGC TAGCAACAGA CTTAAGCAAT 1560

[illegible] [illegible] CCCCCAAGTG GTTTTATTAG CAATATTGTA GAGAACGGGT 1620

[illegible] GGACAATTTA GAGCCGTGGA AAGCAAATAA TAAGAATGCG TATGTAGATC 1680

[illegible] [illegible] ACTAAAGCTT TATATGTTCA TAAGGACGGA GGATTTTCAC 1740

[illegible] AGATAATTAA AACCGAAAAC TGAGTATTAA TCCAATATAC TGTTAAAGGA 1800

AAACCTTCTA TTCATTTAAA AGATGAAAAT ACTGGATATA TTCATTATGA AGATACAAAT 1860

AATAATTTAA AAGATTATCA AACTATTACT AAACGTTTTA CTACAGGAAC TGATTTAAAG 1920

GGAGTGTATT TAATTTTAAA AAGTCAAAAT GGAGATGAAG CTTGGGGAGA TAACTTTATT 1980

ATTTTGGAAA TTAGTCCTTC TGAAAAGTTA TTAAGTCCAG AATTAATTAA TACAAATAAT 2040

TGGACGAGTA CGGGATCAAC TCATATTAGC GGTAATACAC TCACTCTTTA TCAGGGAGGA 2100

CGAGGAATTC TAAAACAAAA CCTTCAATTA GATAGTTTTT CAACTTATAG AGTGTATTTT 2160

TCTGTGTCCG GAGATGCTAA TGTAAGGATT AGAAATTCTA GGGAAGTGTT ATTTGAAAAA 2220

AGATATATGA GCGGTGCTAA AGATGTTTCT GAAATGTTCA CTACAAAATT TGAGAAAGAT 2280

AACTTTTATA TAGAGCTTTC TCAAGGGAAT AATTTATATG GTGGTCCTAT TGTACATTTT 2340

TACGATGTCT CTATTAAGTA ACCCAA 2366



(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 789 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe
1 5 10 15

Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp
20 25 30

Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asp Leu Thr Leu
35 40 45

Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Asp Ile Ser Gly Lys
50 55 60

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn
65 70 75 80

Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln
85 90 95

Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr
100 105 110

Met Leu Arg Val Tyr Leu Pro Lys Ile Thr Phe Met Leu Ser Asp Val
115 120 125

Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys
130 135 140

Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val
145 150 155 160

Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile
165 170 175

Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr
180 185 190

Ser Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asp Ile Leu Asp Glu
195 200 205

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val
210 215 220

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly
225 230 235 240

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile
245 250 255

Thr Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr
260 265 270

Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr
275 280 285
Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr
290 295 300

Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val
305 310 315 320

Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala
325 330 335

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys
340 345 350

Pro Gly His Ala Leu Ile Gly Phe Glu Ile Ser Asn Asp Ser Ile Thr
355 360 365

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp
370 375 380

Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Met Asp Lys Leu Leu
385 390 395 400

Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe
405 410 415

Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys
420 425 430

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly
435 440 445

Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr
450 455 460

Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val
465 470 475 480

Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala
485 490 495

Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg
500 505 510

Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile
515 520 525

Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile
530 535 540

Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Xaa Asn Xaa Asn Ala Tyr
545 550 555 560

Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His
565 570 575
Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys
580 585 590

Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His
595 600 605

Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn
610 615 620

Asn Leu Xaa Xaa Tyr Gln Thr Ile Asn Lys Arg Phe Thr Thr Gly Thr
625 630 635 640

Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Xaa Glu
645 650 655

Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Ser Glu Lys
660 665 670

Leu Leu Ser Pro Xaa Leu Ile Asn Thr Xaa Asn Trp Thr Ser Thr Gly
675 680 685

Ser Thr Asn Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg
690 695 700

Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Xaa Thr Tyr Arg
705 710 715 720

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser
725 730 735

Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Xaa Val
740 745 750

Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu
755 760 765

Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr
770 775 780

Asp Val Ser Ile Lys
785



(2) INFORMATION FOR SEQ ID NO: 101:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2362 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:
ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 120

GATACAGGTG GTGATCTAAC CCTAGACGAA ATTTTAAAGA ATCAGCAGTT ACTAAATGAT 180

ATTTCTGGTA AATTGGATGG GGTGAATGGA AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240

TTAAATACAG AATTATCTAA GGAAATATTA AAAATTGCAA ATGAACAAAA TCAAGTTTTA 300

AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCGGGTATA TCTACCTAAA 360

ATTACCTTTA TGTTGAGTGA TGTAATGAAA CAAAATTATG CGCTAAGTCT GCAAATAGAA 420

TACTTAAGTA AACAATTGCA AGAGATTTCT GATAAGTTGG ATATTATTAA TGTAAATGTA 480

CTTATTAACT CTACACTTAC TGAAATTACA CCTGCGTATC AAAGGATTAA ATATGTGAAC 540

GAAAAATTTG AGGAATTAAC TTTTGCTACA GAAACTAGTT CAAAAGTAAA AAAGGATGGC 600

TCTCCTGCAG ATATTCTTGA TGAGTTAACT GAGTTAACTG AACTAGCGAA AAGTGTAACA 660

AAAAATGATG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720

AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCATCGG AATTAATTAC TAAAGAAAAT 780

GTGAAAACAA GTGGCAGTGA GGTCGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 840

CTGCAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG GTTAGCAGAT 900

ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960 

AACATCCTCC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020

AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GACATGCATT GATTGGGTTT 1080

GAAATTAGTA ATGATTCAAT TACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140

TATCAAGTCG ATAAGGATTC CTTATCGGAA GTTATTTATG GTGATATGGA TAAATTATTG 1200

TGCCCAGATC AATCTGAACA AATCTATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260

GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320

AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380

GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGGG TGTATATGCC GTTAGGTGTC 1440

ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCT CCAAGCTGAT GAAAATTCAA 1500 

GATTAATTAC TTTAACATGT AAATCATATT TAAGAGAACT ACTGCTAGCA ACAGACTTAA 1560 

GCAATAAAGA AACTAAATTG ATCGTCCCGC CAAGTGGTTT TATTAGCAAT ATTGTAGAGA 1620 

ACGGGTCCAT AGAAGAGGAC AATTTAGAGC CCTGGAAAGC AATAATAGAA TGCGTATGTA 1680 
GATCATACAG GCGGAGTGAA TGGAACTAAA GCTTTATATG TTCATAAGGA CGGAGGAATT 1740

TCACAATTTA TTGGAGATAA GTTAAAACCG AAAACTGAGT ATGTAATCCA ATATACTGTT 1800

AAAGGAAAAC CTTCTATTCA TTTAAAAGAT GAAAATACTG GATATATTCA TTATGAAGAT 1860

ACAAATAATA ATTTAAATTA TCAAACTATT AATAAACGTT TTACTACAGG AACTGATTTA 1920

AAGGGAGTGT ATTTAATTTT AAAAAGTCAA AATGGAATGA AGCTTGGGGA GATAACTTTA 1980

TTATTTTGGA AATTAGTCCT TCTGAAAAGT TATTAAGTCC AAATTAATTA ATACAATAAT 2040

TGGACAGTAC GGGATCAACT AATATTAGCG GTAATACACT CACTCTTTAT CAGGGAGGAC 2100

GAGGGATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTCA ACTTATAGAG TGTATTTTTC 2160

TGTGTCCGGA GATGCTAATG TAAGGATTAG AAATTCTAGG GAAGTGTTAT TTGAAAAAAG 2220

ATATATGAGC GGTGCTAAAA TGTTTCTGAA ATGTTCACAC AAAATTTGAG AAAGATAACT 2280

TTTATATAGA GCTTTCTCAA GGGAATAATT TATATGGTGG TCCTATTGTA CATTTTTACG 2340

ATGTCTCTAT TAAGTAACCC AA 2362

(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 790 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 

Met His Glu Asn Asn Thr Lys Leu Ser Ala Arg Ala Leu Pro Ser Phe
1 5 10 15

Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp
20 25 30

Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asn Leu Thr Leu
35 40 45

Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Glu Ile Ser Gly Lys
50 55 60

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn
65 70 75 80

Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln
85 90 95

Ser Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr
100 105 110

Met Leu His Ile Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val
115 120 125

Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys
130 135 140

Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val
145 150 155 160

Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile
165 170 175

Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr
180 185 190

Thr Leu Lys Val Lys Lys Asp Xaa Ser Pro Ala Asp Ile Leu Asp Glu
195 200 205

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val
210 215 220

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly
225 230 235 240

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile
245 250 255

Ala Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr
260 265 270

Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr
275 280 285

Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr
290 295 300

Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val
305 310 315 320

Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala
325 330 335

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys
340 345 350

Pro Gly Tyr Ala Leu Val Gly Phe Glu Met Ser Asn Asp Ser Ile Thr
355 360 365

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp
370 375 380




Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Thr Asp Lys Leu Leu
385 390 395 400

Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe
405 410 415

Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys
420 425 430

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly
435 440 445

Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr
450 455 460

Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val
465 470 475 480

Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala
485 490 495

Asp Gly Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg
500 505 510

Lys Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile
515 520 525

Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile
530 535 540

Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr
545 550 555 560

Val Asp His Thr Gly Gly Val Lys Gly Thr Lys Ala Leu Tyr Val His
565 570 575

Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Xaa Leu Lys Pro Lys
580 585 590

Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His
595 600 605

Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn
610 615 620

Asn Leu Lys Asp Tyr Gln Thr Ile Thr Lys Arg Phe Thr Thr Gly Thr
625 630 635 640

Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu
645 650 655

Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Ser Glu Lys
660 665 670
Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly
675 680 685

Ser Thr His Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg
690 695 700

Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg
705 710 715 720

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser
725 730 735

Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val
740 745 750

Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu
755 760 765

Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr
770 775 780

Asp Val Xaa Ile Lys Pro
785 790



(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2375 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:
ATGCACGAGA ATAATACTAA ATTAAGCGCA AGGGCCTTAC CGAGTTTTAT TGATTATTTT 60 

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAATATGAT TTTTAAAACG 120

GATACAGGTG GTAATCTAAC CTTAGATGAA ATCCTAAAGA ATCAGCAGTT ACTAAATGAG 180

ATTTCTGGTA AATTGGATGG GGTAAATGGG AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240

TTAAATACAG AATTATCTAA GGAAATCTTA AAAATTGCAA ATGAACAGAG TCAAGTTTTA 300

AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCATATATA TCTACCTAAA 360

ATTACATCTA TGTTAAGTGA TGTAATGAAG CAAAATTATG CGCTAAGTCT GCAAATAGAA 420

TACTTAAGTA AACAATTGCA AGAAATTTCT GATAAATTAG ATATTATTAA CGTAAATGTT 480

CTTATTAACT CTACACTTAC TGAAATTACA CCTGCATATC AACGGATTAA ATATGTGAAT 540

GAAAAATTTG AAGAATTAAC TTTTGCTACA GAAACCACTT TAAAAGTAAA AAAGGATRAC 600

TCGCCTGCTG ATATTCTTGA TGAATTAACT GAATTAACTG AACTAGCGAA AAGTGTTACA 660

AAAAATGACG TTGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720

AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCTTCAG AATTAATTGC TAAAGAAAAT 780

GTGAAAACAA GTGGCAGTGA AGTAGGAAAT GTTTATAATT TCTTAATTGT ATTAACAGCT 840

CTACAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900

ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960

AACATCCTTC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020

AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GATATGCATT GGTTGGTTTT 1080 

GAAATGAGCA ATGATTCAAT CACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 114 0 

TATCAAGTTG ATAAGGATTC CTTATCGGAG GTTATTTATG GTGATACGGA TAAATTATTG 1200 

TGTCCAGATC AATCTGAACA AATATATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 126 0 

GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 132 0 

AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 13 80 

GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGAG TGTATATGCC ATTAGGTGTC 144 0 

ATCAGTGAAA CATTTTTGAC TCCGATAAAT GGGTTTGGCC TCCAAGCTGA TGGAAATTCA 15 0 0 

. AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAAAAC TACTGCTAGC AACAGACTTA 1560 

AGCAATAAAG AAACTAAATT GATCGTCCCG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 162 0 

AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 1680 

GTAGATCATA CAGGCGGAGT GAAAGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 174 0 

ATTTCACAAT TTATTGGAGA TAAKTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1800 

GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA I860 

GATACAAATA ATAATTTAAA AGATTATCAA ACTATTACTA AACGTTTTAC TACAGGAACT 192 0 

GATTTAAAGG GAGTGTATTT AATTTTAAAA AGTCAAAATG GAGATGAAGC TTGGGGAGAT 1980 

- AACTTTATTA TTTTGGAAAT TAGTCCTTCT GAAAAGTTAT TAAGTCCAGA ATTAATTAAT 2 04 0 

ACAAATAATT GGACGAGTAC GGGATCAACT CATATTAGCG GTAATACACT CACTCTTTAT 2100 

CAGGGAGGAC GAGGAATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTTC AACTTATAGA 2160 

GTGTATTTTT C TGTGTCCGG AGATGCTAAT GTAAGGATTA GAAATTCTAG GGAAGTGTTA 2 22 0 
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TTTGAAAAAA GATATATGAG CGGTGCTAAA GATGTTTCTG AAATGTTCAC TACAAAATTT 22 80 

GAGAAAGATA ACTT7TATAT AGAGCTTTCT CAAGGGAATA ATTTATATGG TGGTCCTATT 234 0 

GTGCATTTTT ACGATGTCYC TATTAAGTAA CCCAA 23 7 5 
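
The nucleotide listings use IUPAC ambiguity codes (R, Y, K, M, S, W), and the matching protein listings show Xaa where a codon cannot be resolved. The following is a minimal sketch of that relationship, assuming the standard genetic code; the function names `translate_codon` and `translate` are illustrative and do not come from the application.

```python
from itertools import product

# IUPAC nucleotide codes: each symbol maps to the bases it may stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code in the conventional TCAG order (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): AMINO[i]
    for i, codon in enumerate(product(BASES, repeat=3))
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def translate_codon(codon):
    """Translate one codon; return 'Xaa' if ambiguity changes the residue."""
    options = product(*(IUPAC[base] for base in codon))
    residues = {CODON_TABLE["".join(option)] for option in options}
    if len(residues) == 1:
        return THREE_LETTER[residues.pop()]
    return "Xaa"


def translate(dna):
    """Translate a DNA string in frame 1, ignoring spaces and a partial last codon."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    return " ".join(translate_codon(codon) for codon in codons)


if __name__ == "__main__":
    # In-frame fragment from the 1741-1800 row of SEQ ID NO: 103.
    print(translate("GGAGATAAKTTAAAACCGAAA"))
```

The fragment contains the codon AAK, which can be read as AAG (Lys) or AAT (Asn), so it is reported as Xaa. An ambiguous codon whose alternatives are synonymous, such as CTR, still yields a single residue.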

(2) INFORMATION FOR SEQ ID NO:104: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 554 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:

Thr Leu His Leu Leu Lys Leu His Leu Arg Ile Lys Gly Leu Asn Met
1 5 10 15

Thr Lys Asn Leu Arg Asn Leu Leu Leu Xaa Xaa Leu Xaa Gln Lys Lys
20 25 30

Arg Met Ala Leu Leu Gln Ile Phe Xaa Met Ser Leu Ser Xaa Asn Arg
35 40 45

Lys Val Gln Lys Met Met Trp Met Val Leu Asn Phe Thr Leu Ile His
50 55 60

Ser Thr Met Xaa Glu Ile Ile Tyr Ser Gly Val Gln Leu Lys Leu Xaa
65 70 75 80

Arg Asn Leu Leu Lys Lys Met Lys Gln Val Ala Val Xaa Xaa Glu Met
85 90 95

Phe Ile Xaa Ser Leu Tyr Gln Leu Xaa Lys Gln Lys Leu Phe Leu Leu
100 105 110

Gln His Ala Glu Asn Tyr Xaa Gln Ile Leu Ile Ile Leu Leu Leu Met
115 120 125

Asn Ile Ile Arg Lys Lys Arg Asn Leu Glu Thr Ser Xaa Leu His Phe
130 135 140

Leu Ile Leu Phe Leu Ile Leu Ile Met Gln Lys Leu Lys Glu Val Met
145 150 155 160

Lys Met Gln Arg Leu Trp Lys Leu Asn Gln Asp Met His Trp Leu Val
165 170 175

Leu Lys Ala Met Ile Gln Ser Gln Tyr Lys Tyr Met Arg Leu Ser Asn
180 185 190

Lys Ile Ile Lys Leu Ile Arg Ile Pro Tyr Arg Arg Leu Phe Met Val
195 200 205

Ile Arg Ile Asn Tyr Cys Val Gln Ile Asn Leu Asn Lys Tyr Ile Ile
210 215 220

Gln Ile Thr Tyr Phe Gln Met Asn Met Leu Leu Lys Leu Ile Ser Leu
225 230 235 240

Lys Lys Lys Leu Asp Met Arg Gln Arg Ile Phe Met Ile Leu Leu Gln
245 250 255

Glu Lys Leu Thr Ile Arg Lys Lys Asn Gln Val Lys Arg Ser Ile Glu
260 265 270

Arg Val Leu Met Met Met Xaa Cys Ile Cys His Val Ser Ser Val Lys
275 280 285

His Phe Leu Arg Met Gly Leu Ala Ser Lys Leu Arg Gln Ile Gln Asp
290 295 300

Leu Leu His Val Asn His Ile Glu Asn Tyr Cys Gln Gln Thr Ala Ile
305 310 315 320

Arg Lys Leu Asn Ser Ser Arg Gln Val Phe Tyr Gln Tyr Cys Arg Glu
325 330 335

Arg Val Leu Arg Arg Gly Gln Phe Arg Ala Val Glu Ser Lys Glu Cys
340 345 350

Val Cys Arg Ser Tyr Arg Arg Ser Glu Trp Asn Ser Phe Ile Cys Ser
355 360 365

Gly Arg Arg Asn Phe Thr Ile Tyr Trp Arg Val Lys Thr Glu Asn Val
370 375 380

Cys Asn Pro Ile Tyr Cys Arg Lys Thr Phe Tyr Ser Phe Lys Arg Lys
385 390 395 400

Tyr Trp Ile Tyr Ser Leu Arg Tyr Lys Phe Lys Arg Leu Ser Asn Tyr
405 410 415

Tyr Thr Phe Tyr Tyr Arg Asn Phe Lys Gly Ser Val Phe Asn Phe Lys
420 425 430

Lys Ser Lys Trp Arg Ser Leu Gly Arg Leu Tyr Tyr Phe Gly Asn Ser
435 440 445

Phe Lys Val Ile Lys Ser Arg Ile Asn Tyr Lys Leu Asp Glu Tyr Gly
450 455 460

Ile Asn Ser Tyr Arg Tyr Thr His Ser Leu Ser Gly Arg Thr Arg Asn
465 470 475 480

Ser Lys Thr Lys Pro Ser Ile Arg Phe Phe Asn Leu Ser Val Phe Phe
485 490 495

Cys Val Arg Arg Cys Cys Lys Asp Lys Phe Gly Ser Val Ile Lys Lys
500 505 510

Ile Tyr Glu Arg Cys Arg Cys Phe Asn Val His Tyr Lys Ile Glu Arg
515 520 525

Leu Leu Tyr Arg Ala Phe Ser Arg Glu Phe Ile Trp Trp Ser Tyr Cys
530 535 540

Thr Phe Leu Arg Cys Leu Tyr Val Thr Gln
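
For comparison with other sequences, a three-letter protein listing such as SEQ ID NO: 104 is often collapsed into one-letter code. The sketch below assumes rows of three-letter residues interleaved with rows of position numbers, as in the listing above; the helper name `to_one_letter` is hypothetical, and Xaa is mapped to X by convention.

```python
# Three-letter to one-letter amino acid codes; Xaa (unknown) becomes X.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(rows):
    """Collapse listing rows into a one-letter string, skipping position numbers."""
    letters = []
    for row in rows:
        for token in row.split():
            if token.isdigit():
                continue  # residue position marker, not a residue
            # A KeyError here flags a residue token that could not be read.
            letters.append(THREE_TO_ONE[token])
    return "".join(letters)


if __name__ == "__main__":
    first_rows = [
        "Thr Leu His Leu Leu Lys Leu His Leu Arg Ile Lys Gly Leu Asn Met",
        "1 5 10 15",
    ]
    print(to_one_letter(first_rows))
```

Raising on an unknown token, rather than silently skipping it, is a deliberate choice in this sketch so that misread residues surface instead of shortening the sequence.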



(2) INFORMATION FOR SEQ ID NO:105: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1888 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:

ACTCTACACT TACTGAAATT ACACCTGCGT ATCAAAGGAT TAAATATGTG AACGAAAAAT 60

TTGAGGAATT AACTTTTGCT ACRGAMACTA KTTCAAAAGT AAAAAMGGAT GGCTCTCCTS 120

CAGATATTCT KGATGAGTTA ACTGAGTTAA CWGAACTAGC GAAAAGTGTA ACAAAAAATG 180

ATGTGGATGG TTTTRAATTT TACCTTAATA CATTCCACGA TGTAAKGGTA GGAAATAATT 240

TATTCGGGCG TTCAGCTTTA AAAACTGCWT CGGAATTAAT TRCTAAAGAA AATGTGAAAA 300

CAAGTGGCAG TGARGTMGGA AATGTTTATA AYTTCTTAAT TGTATTAACA GCTCTRCAAG 360

CAAAAGCTTT TCTTACTTTA ACAACATGCC GAAAATTATT AGGSTTAGCA GATATTGATT 420

ATACTTCTAT TATGAATGAA CATTTAAATA AGGAAAAAGA GGAATTTAGA GTAAACATCC 480

TYCCTACACT TTCTAATACT TTTTCTAATC CTAATTATGC AAAAGTTAAA GGAAGTGATG 540

AAGATGCAAA GATGATTGTG GAAGCTAAAC CAGGATATGC ATTGGTTGGT TTTGAAATGA 600

GCAATGATTC AATCACAGTA TTAAAAGTAT ATGAGGCTAA GCTAAAACAA AATTATCAAG 660

TTGATAAGGA TTCCTTATCG GAGGTTATTT ATGGTGATAC GGATAAATTA TTGTGTCCAG 720

ATCAATCTGA ACAAATATAT TATACAAATA ACATAGTATT TCCAAATGAA TATGTAATTA 780

CTAAAATTGA TTTCACTAAA AAAATGAAAA CTTTAAGATA TGAGGTAACA GCGAATTTTT 840

ATGATTCTTC TACAGGAGAA ATTGACTTAA ATAAGAAAAA AGTAGAATCA AGTGAAGCGG 900

AGTATAGAAC GTTAAGTGCT AATGATGATG GRGTGTATAT GCCATTAGGT GTCATCAGTG 960

AAACATTTTT GACTCCGATA AATGGGTTTG GCCTCCAAGC TGAGGCAAAT TCAAGATTAA 1020

TTACTTTAAC ATGTAAATCA TATTTAAGAG AACTACTGCT AGCAACAGAC TTAAGCAATW 1080

AGGAAACTAA ATTGATCTTC CCGCCAAGTG TTTTATTAGC AATATTGTAG AGAACGGGTC 1140

CTTAGAAGAG GACAATTTAG AGCCGTGGAA AGCAAATAAT AAGAATGCGT ATGTAGATCA 1200

TACAGGCGGA GTGAATGGAA CTAAAGCTTT ATATGTTCAT AAGGACGGAG GAATTTCACA 1260

ATTTATTGGA GATAAGTTAA AACCGAAAAC TGAGTATGTA ATCCAATATA CTGTTAAAGG 1320

AAAACCTTCT ATTCATTTAA AAGATGAAAA TACTGGATAT ATTCATTATG AAGATACAAA 1380

TAATAATTTA AAAGATTATC AAACTATTAC TAAACGTTTT ACTACAGGAA CTGATTTAAA 1440

GGGAGTGTAT TTAATTTTAA AAAGTCAAAA TGGAGATGAA GCTTGGGGAG ATAACTTTAT 1500

TATTTTGGAA ATTAGTCCTT CTGAAAAGTT ATTAAGTCCA GAATTAATTA ATACAAATAA 1560

TTGGACGAGT ACGGGATCAA CTCATATTAG CGGTAATACA CTCACTCTTT ATCAGGGAGG 1620

ACGAGGAATT CTAAAACAAA ACCTTCAATT AGATAGTTTT TCAACTTATA GAGTGTATTT 1680

TTCTGTGTCC GGAGATGCTA ATGTAAGGAT TAGAAATTCT AGGGAAGTGT TATTTGAAAA 1740

AAGATATATG AGCGGTGCTA AAGATGTTTC TGAAATGTTC ACTACAAAAT TTGAGAAAGA 1800

TAACTTTTAT ATAGAGCTTT CTCAAGGGAA TAATTTATAT GGTGGTCCTA TTGTACATTT 1860

TTACGATGTC TCTATTAAGT AACCCAAA 1888
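
Each row of a nucleotide listing ends with a running base count, which gives a simple consistency check on a transcribed sequence. The sketch below is one way to perform that check, assuming rows of space-separated base groups followed by a count; `parse_listing` is a hypothetical helper, and it tolerates counts whose digits were split by stray spaces.

```python
import re

# Bases plus IUPAC ambiguity codes that may appear in a listing row.
BASE_GROUP = re.compile(r"[ACGTRYSWKMBDHVN]+")


def parse_listing(rows):
    """Join base groups into one sequence and compare each running count.

    Returns (sequence, mismatches), where mismatches is a list of
    (declared_count, actual_count) pairs for rows that disagree.
    """
    parts = []
    mismatches = []
    total = 0
    for row in rows:
        tokens = row.split()
        groups = [t for t in tokens if BASE_GROUP.fullmatch(t)]
        digits = [t for t in tokens if t.isdigit()]
        parts.extend(groups)
        total += sum(len(g) for g in groups)
        if digits:
            declared = int("".join(digits))  # "12 0" is read as 120
            if declared != total:
                mismatches.append((declared, total))
    return "".join(parts), mismatches


if __name__ == "__main__":
    rows = [
        "ATGCACGAGA ATAATACTAA ATTAAGCGCA AGGGCCTTAC CGAGTTTTAT TGATTATTTT 60",
        "AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAATATGAT TTTTAAAACG 12 0",
    ]
    sequence, mismatches = parse_listing(rows)
    print(len(sequence), mismatches)
```

An empty mismatch list means every declared count agrees with the bases read so far. A non-empty list points to the first row where a base group was dropped, duplicated, or misread.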
Claims

1. A biologically pure culture of a Bacillus thuringiensis isolate active against a
non-mammalian pest, wherein said isolate is selected from the group consisting of PS185U2,
PS11B, PS218G2, PS213E5, PS28C, PS86BB1, PS89J3, PS94R1, PS27J2, PS202S, PS101DD, PS31G1,
and mutants thereof.

2. An isolated polynucleotide comprising a nucleotide sequence which encodes all or part of
a toxin active against a non-mammalian pest, wherein said nucleotide sequence hybridizes
with a nucleotide sequence which encodes an amino acid sequence selected from the group
consisting of SEQ ID NO. 7, SEQ ID NO. 9, SEQ ID NO. 11, SEQ ID NO. 13, SEQ ID NO. 15,
SEQ ID NO. 17, SEQ ID NO. 19, SEQ ID NO. 21, SEQ ID NO. 23, SEQ ID NO. 25, SEQ ID NO. 27,
SEQ ID NO. 29, SEQ ID NO. 31, SEQ ID NO. 33, SEQ ID NO. 35, SEQ ID NO. 37, SEQ ID NO. 39,
SEQ ID NO. 41, SEQ ID NO. 43, SEQ ID NO. 45, SEQ ID NO. 47, SEQ ID NO. 49, SEQ ID NO. 51,
SEQ ID NO. 53, SEQ ID NO. 55, SEQ ID NO. 57, SEQ ID NO. 59, and SEQ ID NO. 61.

3. The isolated polynucleotide, according to claim 2, comprising a nucleotide sequence
which hybridizes with a nucleotide sequence selected from the group consisting of
SEQ ID NO. 8, SEQ ID NO. 10, SEQ ID NO. 12, SEQ ID NO. 14, SEQ ID NO. 16, SEQ ID NO. 18,
SEQ ID NO. 20, SEQ ID NO. 22, SEQ ID NO. 24, SEQ ID NO. 26, SEQ ID NO. 28, SEQ ID NO. 30,
SEQ ID NO. 32, SEQ ID NO. 34, SEQ ID NO. 36, SEQ ID NO. 38, SEQ ID NO. 40, SEQ ID NO. 42,
SEQ ID NO. 44, SEQ ID NO. 46, SEQ ID NO. 48, SEQ ID NO. 50, SEQ ID NO. 52, SEQ ID NO. 54,
SEQ ID NO. 56, SEQ ID NO. 58, SEQ ID NO. 60, and SEQ ID NO. 62.

4. An isolated polynucleotide comprising a nucleotide sequence which encodes all or part of
a toxin which is active against a non-mammalian pest wherein said nucleotide sequence
encodes a toxin which comprises an amino acid sequence having at least about 75% homology
with a sequence selected from the group consisting of SEQ ID NO. 7, SEQ ID NO. 9,
SEQ ID NO. 11, SEQ ID NO. 13, SEQ ID NO. 15, SEQ ID NO. 17, SEQ ID NO. 19, SEQ ID NO. 21,
SEQ ID NO. 23, SEQ ID NO. 25, SEQ ID NO. 27, SEQ ID NO. 29, SEQ ID NO. 31, SEQ ID NO. 33,
SEQ ID NO. 35, SEQ ID NO. 37, SEQ ID NO. 39, SEQ ID NO. 41, SEQ ID NO. 43, SEQ ID NO. 45,
SEQ ID NO. 47, SEQ ID NO. 49, SEQ ID NO. 51, SEQ ID NO. 53, SEQ ID NO. 55, SEQ ID NO. 57,
SEQ ID NO. 59, and SEQ ID NO. 61.

5. The isolated polynucleotide, according to claim 4, wherein said nucleotide sequence
encodes a toxin which comprises an amino acid sequence having at least about 90% homology
with a sequence selected from the group consisting of SEQ ID NO. 7, SEQ ID NO. 9,
SEQ ID NO. 11, SEQ ID NO. 13, SEQ ID NO. 15, SEQ ID NO. 17, SEQ ID NO. 19, SEQ ID NO. 21,
SEQ ID NO. 23, SEQ ID NO. 25, SEQ ID NO. 27, SEQ ID NO. 29, SEQ ID NO. 31, SEQ ID NO. 33,
SEQ ID NO. 35, SEQ ID NO. 37, SEQ ID NO. 39, SEQ ID NO. 41, SEQ ID NO. 43, SEQ ID NO. 45,
SEQ ID NO. 47, SEQ ID NO. 49, SEQ ID NO. 51, SEQ ID NO. 53, SEQ ID NO. 55, SEQ ID NO. 57,
SEQ ID NO. 59, and SEQ ID NO. 61.
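
The 75% and 90% thresholds recited in these claims can be illustrated with a simple percent-identity calculation. This is only a sketch under stated assumptions: the claims do not say here how homology is to be computed, so the code assumes two sequences that have already been aligned to equal length with `-` for gaps, and counts identical positions. The name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of compared columns that hold the same residue.

    Both inputs must be pre-aligned strings of equal length, with '-'
    marking a gap. Columns that are gaps in both sequences are ignored;
    a gap against a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def meets_threshold(aligned_a, aligned_b, threshold):
    """True if the pair reaches the given percent threshold (e.g. 75 or 90)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy aligned pair in one-letter code; not taken from the sequence listing.
    print(percent_identity("TLHL", "TLHA"))
```

Other conventions (similarity scoring matrices, different gap handling, normalizing by the shorter sequence) give different percentages, so a real comparison would need the definition used in the specification.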

6. An isolated polynucleotide comprising a nucleotide sequence which encodes all or part of
a toxin active against a non-mammalian pest wherein said toxin immunoreacts with an
antibody to a toxin from a B.t. isolate selected from the group consisting of PS185U2,
PS11B, PS218G2, PS213E5, PS86W1, PS28C, PS86BB1, PS89J3, PS86V1, PS94R1, HD525, HD573A,
PS27J2, HD110, HD10, PS202S, HD29, PS101DD, HD129, and PS31G1.

7. An isolated polynucleotide comprising a nucleotide sequence which encodes all or part of
a toxin active against a non-mammalian pest wherein said toxin has at least about 75%
homology with a toxin from a B.t. isolate selected from the group consisting of PS185U2,
PS11B, PS218G2, PS213E5, PS86W1, PS28C, PS86BB1, PS89J3, PS86V1, PS94R1, HD525, HD573A,
PS27J2, HD110, HD10, PS202S, HD29, PS101DD, HD129, and PS31G1.

8. The isolated polynucleotide, according to claim 7, wherein said nucleotide sequence
encodes a protein which has at least about 90% homology with a toxin from a B.t. isolate
selected from the group consisting of PS185U2, PS11B, PS218G2, PS213E5, PS86W1, PS28C,
PS86BB1, PS89J3, PS86V1, PS94R1, HD525, HD573A, PS27J2, HD110, HD10, PS202S, HD29,
PS101DD, HD129, and PS31G1.

9. An isolated nucleotide sequence selected from the group consisting of SEQ ID NO. 1,
SEQ ID NO. 2, SEQ ID NO. 3, SEQ ID NO. 4, SEQ ID NO. 5, SEQ ID NO. 6, SEQ ID NO. 63,
SEQ ID NO. 64, SEQ ID NO. 65, SEQ ID NO. 66, SEQ ID NO. 67, SEQ ID NO. 68, and
SEQ ID NO. 69.

10. A purified or recombinant toxin having activity against a non-mammalian pest, wherein
said toxin comprises an amino acid sequence encoded by a nucleotide sequence which
hybridizes with a nucleotide sequence which encodes an amino acid sequence selected from
the group consisting of SEQ ID NO. 7, SEQ ID NO. 9, SEQ ID NO. 11, SEQ ID NO. 13,
SEQ ID NO. 15, SEQ ID NO. 17, SEQ ID NO. 19, SEQ ID NO. 21, SEQ ID NO. 23, SEQ ID NO. 25,
SEQ ID NO. 27, SEQ ID NO. 29, SEQ ID NO. 31, SEQ ID NO. 33, SEQ ID NO. 35, SEQ ID NO. 37,
SEQ ID NO. 39, SEQ ID NO. 41, SEQ ID NO. 43, SEQ ID NO. 45, SEQ ID NO. 47, SEQ ID NO. 49,
SEQ ID NO. 51, SEQ ID NO. 53, SEQ ID NO. 55, SEQ ID NO. 57, SEQ ID NO. 59, and
SEQ ID NO. 61.

11. The toxin, according to claim 10, wherein said toxin comprises an amino acid sequence
encoded by a nucleotide sequence which hybridizes with a nucleotide sequence selected from
the group consisting of SEQ ID NO. 8, SEQ ID NO. 10, SEQ ID NO. 12, SEQ ID NO. 14,
SEQ ID NO. 16, SEQ ID NO. 18, SEQ ID NO. 20, SEQ ID NO. 22, SEQ ID NO. 24, SEQ ID NO. 26,
SEQ ID NO. 28, SEQ ID NO. 30, SEQ ID NO. 32, SEQ ID NO. 34, SEQ ID NO. 36, SEQ ID NO. 38,
SEQ ID NO. 40, SEQ ID NO. 42, SEQ ID NO. 44, SEQ ID NO. 46, SEQ ID NO. 48, SEQ ID NO. 50,
SEQ ID NO. 52, SEQ ID NO. 54, SEQ ID NO. 56, SEQ ID NO. 58, SEQ ID NO. 60, and
SEQ ID NO. 62.

12. The toxin, according to claim 11, wherein said toxin comprises an amino acid sequence
selected from the group consisting of SEQ ID NO. 7, SEQ ID NO. 9, SEQ ID NO. 11,
SEQ ID NO. 13, SEQ ID NO. 15, SEQ ID NO. 17, SEQ ID NO. 19, SEQ ID NO. 21, SEQ ID NO. 23,
SEQ ID NO. 25, SEQ ID NO. 27, SEQ ID NO. 29, SEQ ID NO. 31, SEQ ID NO. 33, SEQ ID NO. 35,
SEQ ID NO. 37, SEQ ID NO. 39, SEQ ID NO. 41, SEQ ID NO. 43, SEQ ID NO. 45, SEQ ID NO. 47,
SEQ ID NO. 49, SEQ ID NO. 51, SEQ ID NO. 53, SEQ ID NO. 55, SEQ ID NO. 57, SEQ ID NO. 59,
and SEQ ID NO. 61.

13. A purified or recombinant toxin having activity against a non-mammalian pest, wherein
said toxin comprises an amino acid sequence having at least about 75% homology with a
sequence selected from the group consisting of SEQ ID NO. 7, SEQ ID NO. 9, SEQ ID NO. 11,
SEQ ID NO. 13, SEQ ID NO. 15, SEQ ID NO. 17, SEQ ID NO. 19, SEQ ID NO. 21, SEQ ID NO. 23,
SEQ ID NO. 25, SEQ ID NO. 27, SEQ ID NO. 29, SEQ ID NO. 31, SEQ ID NO. 33, SEQ ID NO. 35,
SEQ ID NO. 37, SEQ ID NO. 39, SEQ ID NO. 41, SEQ ID NO. 43, SEQ ID NO. 45, SEQ ID NO. 47,
SEQ ID NO. 49, SEQ ID NO. 51, SEQ ID NO. 53, SEQ ID NO. 55, SEQ ID NO. 57, SEQ ID NO. 59,
and SEQ ID NO. 61.

14. The toxin, according to claim 13, which comprises an amino acid sequence having at
least about 90% homology with a sequence selected from the group consisting of
SEQ ID NO. 7, SEQ ID NO. 9, SEQ ID NO. 11, SEQ ID NO. 13, SEQ ID NO. 15, SEQ ID NO. 17,
SEQ ID NO. 19, SEQ ID NO. 21, SEQ ID NO. 23, SEQ ID NO. 25, SEQ ID NO. 27, SEQ ID NO. 29,
SEQ ID NO. 31, SEQ ID NO. 33, SEQ ID NO. 35, SEQ ID NO. 37, SEQ ID NO. 39, SEQ ID NO. 41,
SEQ ID NO. 43, SEQ ID NO. 45, SEQ ID NO. 47, SEQ ID NO. 49, SEQ ID NO. 51, SEQ ID NO. 53,
SEQ ID NO. 55, SEQ ID NO. 57, SEQ ID NO. 59, and SEQ ID NO. 61.

15. The toxin, according to claim 13, which comprises an amino acid sequence selected from
the group consisting of SEQ ID NO. 7, SEQ ID NO. 9, SEQ ID NO. 11, SEQ ID NO. 13,
SEQ ID NO. 15, SEQ ID NO. 17, SEQ ID NO. 19, SEQ ID NO. 21, SEQ ID NO. 23, SEQ ID NO. 25,
SEQ ID NO. 27, SEQ ID NO. 29, SEQ ID NO. 31, SEQ ID NO. 33, SEQ ID NO. 35, SEQ ID NO. 37,
SEQ ID NO. 39, SEQ ID NO. 41, SEQ ID NO. 43, SEQ ID NO. 45, SEQ ID NO. 47, SEQ ID NO. 49,
SEQ ID NO. 51, SEQ ID NO. 53, SEQ ID NO. 55, SEQ ID NO. 57, SEQ ID NO. 59, and
SEQ ID NO. 61.

16. A purified or recombinant toxin having activity against a non-mammalian pest wherein
said toxin immunoreacts with an antibody to a toxin from a B.t. isolate selected from the
group consisting of PS185U2, PS11B, PS218G2, PS213E5, PS86W1, PS28C, PS86BB1, PS89J3,
PS86V1, PS94R1, HD525, HD573A, PS27J2, HD110, HD10, PS202S, HD29, PS101DD, HD129, and
PS31G1.

17. A purified or recombinant toxin having activity against a non-mammalian pest wherein
said toxin has at least about 75% homology with a toxin from a B.t. isolate selected from
the group consisting of PS185U2, PS11B, PS218G2, PS213E5, PS86W1, PS28C, PS86BB1, PS89J3,
PS86V1, PS94R1, HD525, HD573A, PS27J2, HD110, HD10, PS202S, HD29, PS101DD, HD129, and
PS31G1.

18. The toxin, according to claim 17, wherein said toxin has at least about 90% homology
with a toxin from a B.t. isolate selected from the group consisting of PS185U2, PS11B,
PS218G2, PS213E5, PS86W1, PS28C, PS86BB1, PS89J3, PS86V1, PS94R1, HD525, HD573A, PS27J2,
HD110, HD10, PS202S, HD29, PS101DD, HD129, and PS31G1.

19. A recombinant host transformed with a polynucleotide of claim 2.

20. The recombinant host, according to claim 19, wherein said transformed host is a
bacterium.

21. The recombinant host, according to claim 20, wherein said bacterium is selected from
the group consisting of E. coli, Pseudomonas, and Bacillus thuringiensis.

22. The recombinant host, according to claim 19, wherein said transformed host is a plant.

23. A composition of matter for controlling a non-mammalian pest, wherein said composition
comprises a Bacillus thuringiensis isolate selected from the group consisting of PS185U2,
PS11B, PS218G2, PS213E5, PS28C, PS86BB1, PS89J3, PS94R1, PS27J2, PS202S, PS101DD, PS31G1,
and mutants thereof, or a toxin therefrom, in association with an agricultural carrier.

24. A method for controlling a non-mammalian pest which comprises contacting said pest with
a toxin from a Bacillus thuringiensis isolate selected from the group consisting of
PS185U2, PS11B, PS218G2, PS213E5, PS86W1, PS28C, PS86BB1, PS89J3, PS86V1, PS94R1, HD525,
HD573A, PS27J2, HD110, HD10, PS202S, HD29, PS101DD, HD129, and PS31G1.

25. A method for the control of a non-mammalian pest which comprises contacting said pest
with a pesticidal amount of a Bacillus thuringiensis toxin wherein said toxin has a
characteristic selected from the group consisting of:

(a) said toxin has at least about 75% homology to a toxin from a Bacillus thuringiensis
isolate selected from the group consisting of PS185U2, PS11B, PS218G2, PS213E5, PS86W1,
PS28C, PS86BB1, PS89J3, PS86V1, PS94R1, HD525, HD573A, PS27J2, HD110, HD10, PS202S, HD29,
PS101DD, HD129, and PS31G1;

(b) said toxin comprises an amino acid sequence which is encoded by a nucleotide which
hybridizes with a nucleotide sequence which encodes an amino acid sequence selected from
the group consisting of SEQ ID NO. 7, SEQ ID NO. 9, SEQ ID NO. 11, SEQ ID NO. 13,
SEQ ID NO. 15, SEQ ID NO. 17, SEQ ID NO. 19, SEQ ID NO. 21, SEQ ID NO. 23, SEQ ID NO. 25,
SEQ ID NO. 27, SEQ ID NO. 29, SEQ ID NO. 31, SEQ ID NO. 33, SEQ ID NO. 35, SEQ ID NO. 37,
SEQ ID NO. 39, SEQ ID NO. 41, SEQ ID NO. 43, SEQ ID NO. 45, SEQ ID NO. 47, SEQ ID NO. 49,
SEQ ID NO. 51, SEQ ID NO. 53, SEQ ID NO. 55, SEQ ID NO. 57, SEQ ID NO. 59, and
SEQ ID NO. 61;

(c) said toxin comprises an amino acid sequence having at least about 75% homology with a
sequence selected from the group consisting of SEQ ID NO. 7, SEQ ID NO. 9, SEQ ID NO. 11,
SEQ ID NO. 13, SEQ ID NO. 15, SEQ ID NO. 17, SEQ ID NO. 19, SEQ ID NO. 21, SEQ ID NO. 23,
SEQ ID NO. 25, SEQ ID NO. 27, SEQ ID NO. 29, SEQ ID NO. 31, SEQ ID NO. 33, SEQ ID NO. 35,
SEQ ID NO. 37, SEQ ID NO. 39, SEQ ID NO. 41, SEQ ID NO. 43, SEQ ID NO. 45, SEQ ID NO. 47,
SEQ ID NO. 49, SEQ ID NO. 51, SEQ ID NO. 53, SEQ ID NO. 55, SEQ ID NO. 57, SEQ ID NO. 59,
and SEQ ID NO. 61; and

(d) said toxin immunoreacts with an antibody to a toxin from a Bacillus thuringiensis
isolate selected from the group consisting of PS185U2, PS11B, PS218G2, PS213E5, PS86W1,
PS28C, PS86BB1, PS89J3, PS86V1, PS94R1, HD525, HD573A, PS27J2, HD110, HD10, PS202S, HD29,
PS101DD, HD129, and PS31G1.

26. The method, according to claim 25, wherein said toxin has at least about 90% homology
to a toxin from a Bacillus thuringiensis isolate selected from the group consisting of
PS185U2, PS11B, PS218G2, PS213E5, PS86W1, PS28C, PS86BB1, PS89J3, PS86V1, PS94R1, HD525,
HD573A, PS27J2, HD110, HD10, PS202S, HD29, PS101DD, HD129, and PS31G1.

27. The method, according to claim 25, wherein said toxin is encoded by a nucleotide which
hybridizes with a nucleotide sequence which encodes an amino acid sequence selected from
the group consisting of SEQ ID NO. 7, SEQ ID NO. 9, SEQ ID NO. 11, SEQ ID NO. 13,
SEQ ID NO. 15, SEQ ID NO. 17, SEQ ID NO. 19, SEQ ID NO. 21, SEQ ID NO. 23, SEQ ID NO. 25,
SEQ ID NO. 27, SEQ ID NO. 29, SEQ ID NO. 31, SEQ ID NO. 33, SEQ ID NO. 35, SEQ ID NO. 37,
SEQ ID NO. 39, SEQ ID NO. 41, SEQ ID NO. 43, SEQ ID NO. 45, SEQ ID NO. 47, SEQ ID NO. 49,
SEQ ID NO. 51, SEQ ID NO. 53, SEQ ID NO. 55, SEQ ID NO. 57, SEQ ID NO. 59, and
SEQ ID NO. 61.

28. The method, according to claim 25, wherein said toxin comprises an amino acid sequence
encoded by a nucleotide sequence which hybridizes with a sequence selected from the group
consisting of SEQ ID NO. 8, SEQ ID NO. 10, SEQ ID NO. 12, SEQ ID NO. 14, SEQ ID NO. 16,
SEQ ID NO. 18, SEQ ID NO. 20, SEQ ID NO. 22, SEQ ID NO. 24, SEQ ID NO. 26, SEQ ID NO. 28,
SEQ ID NO. 30, SEQ ID NO. 32, SEQ ID NO. 34, SEQ ID NO. 36, SEQ ID NO. 38, SEQ ID NO. 40,
SEQ ID NO. 42, SEQ ID NO. 44, SEQ ID NO. 46, SEQ ID NO. 48, SEQ ID NO. 50, SEQ ID NO. 52,
SEQ ID NO. 54, SEQ ID NO. 56, SEQ ID NO. 58, SEQ ID NO. 60, and SEQ ID NO. 62.

29. The method, according to claim 25, wherein said toxin comprises an amino acid sequence
having at least about 75% homology with a sequence selected from the group consisting of
SEQ ID NO. 7, SEQ ID NO. 9, SEQ ID NO. 11, SEQ ID NO. 13, SEQ ID NO. 15, SEQ ID NO. 17,
SEQ ID NO. 19, SEQ ID NO. 21, SEQ ID NO. 23, SEQ ID NO. 25, SEQ ID NO. 27, SEQ ID NO. 29,
SEQ ID NO. 31, SEQ ID NO. 33, SEQ ID NO. 35, SEQ ID NO. 37, SEQ ID NO. 39, SEQ ID NO. 41,
SEQ ID NO. 43, SEQ ID NO. 45, SEQ ID NO. 47, SEQ ID NO. 49, SEQ ID NO. 51, SEQ ID NO. 53,
SEQ ID NO. 55, SEQ ID NO. 57, SEQ ID NO. 59, and SEQ ID NO. 61.

30. The method, according to claim 29, wherein said toxin comprises an amino acid sequence
having at least about 90% homology with a sequence selected from the group consisting of
SEQ ID NO. 7, SEQ ID NO. 9, SEQ ID NO. 11, SEQ ID NO. 13, SEQ ID NO. 15, SEQ ID NO. 17,
SEQ ID NO. 19, SEQ ID NO. 21, SEQ ID NO. 23, SEQ ID NO. 25, SEQ ID NO. 27, SEQ ID NO. 29,
SEQ ID NO. 31, SEQ ID NO. 33, SEQ ID NO. 35, SEQ ID NO. 37, SEQ ID NO. 39, SEQ ID NO. 41,
SEQ ID NO. 43, SEQ ID NO. 45, SEQ ID NO. 47, SEQ ID NO. 49, SEQ ID NO. 51, SEQ ID NO. 53,
SEQ ID NO. 55, SEQ ID NO. 57, SEQ ID NO. 59, and SEQ ID NO. 61.

31. The method, according to claim 25, wherein said pest is a lepidopteran.

32. A method for identifying polynucleotide sequences which encode toxins with activity
against pests, wherein said method comprises preparing target Bacillus thuringiensis DNA
for PCR amplification or for hybridization with a polynucleotide probe and then either

(a) determining whether the target DNA hybridizes with a polynucleotide sequence selected
from the group consisting of SEQ ID NO. 8, SEQ ID NO. 10, SEQ ID NO. 12, SEQ ID NO. 14,
SEQ ID NO. 16, SEQ ID NO. 18, SEQ ID NO. 20, SEQ ID NO. 22, SEQ ID NO. 24, SEQ ID NO. 26,
SEQ ID NO. 28, SEQ ID NO. 30, SEQ ID NO. 32, SEQ ID NO. 34, SEQ ID NO. 36, SEQ ID NO. 38,
SEQ ID NO. 40, SEQ ID NO. 42, SEQ ID NO. 44, SEQ ID NO. 46, SEQ ID NO. 48, SEQ ID NO. 50,
SEQ ID NO. 52, SEQ ID NO. 54, SEQ ID NO. 56, SEQ ID NO. 58, SEQ ID NO. 60, and
SEQ ID NO. 62; or

(b) subjecting said target DNA to PCR with a primer pair selected from the group consisting
of primer pair 1 (SEQ ID NOS. 1 and 2); primer pair 2 (SEQ ID NOS. 3 and 4); and primer
pair 3 (SEQ ID NOS. 5 and 6).

33. A polynucleotide sequence encoding a lepidopteran-active toxin wherein said
polynucleotide sequence is identified by the process of claim 32.
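
Option (b) of claim 32 screens target DNA by PCR with a primer pair. As a rough illustration of what a primer pair selects, the sketch below finds the region of a template bounded by a forward primer and the reverse complement of a reverse primer. It assumes exact matches only, which real PCR does not require, and the primers in the usage example are hypothetical; they are not SEQ ID NOS. 1 through 6, which are not reproduced in this part of the document.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_amplicon(template, forward, reverse):
    """Return the template region bounded by the primer pair, or None.

    The forward primer must occur in the template, and the reverse
    complement of the reverse primer must occur downstream of it.
    Only exact matches are considered in this sketch.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start < 0:
        return None
    downstream_site = reverse_complement(reverse)
    end = template.find(downstream_site, start + len(forward))
    if end < 0:
        return None
    return template[start:end + len(downstream_site)]


if __name__ == "__main__":
    # First 30 bases of SEQ ID NO: 103 as a toy template.
    template = "ATGCACGAGAATAATACTAAATTAAGCGCA"
    # Hypothetical primers chosen from the ends of the toy template.
    forward = "ATGCACGAGA"
    reverse = reverse_complement("ATTAAGCGCA")
    print(find_amplicon(template, forward, reverse))
```

A practical screen would also allow mismatches and degenerate primer positions, which is why this exact-match version should be read as an illustration only.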

34. An isolated polynucleotide comprising a nucleotide sequence which encodes all or part
of a toxin which is active against a non-mammalian pest wherein said toxin comprises an
amino acid sequence having at least about 75% homology with a sequence selected from the
group consisting of SEQ ID NO. 70, SEQ ID NO. 72, and SEQ ID NO. 74.

35. The isolated polynucleotide, according to claim 34, wherein said nucleotide sequence
encodes a toxin which comprises an amino acid sequence having at least about 90% homology
with a sequence selected from the group consisting of SEQ ID NO. 70, SEQ ID NO. 72, and
SEQ ID NO. 74.

36. The isolated polynucleotide, according to claim 34, wherein said nucleotide sequence
encodes a toxin which comprises an amino acid sequence selected from the group consisting
of SEQ ID NO. 70, SEQ ID NO. 72, and SEQ ID NO. 74.

37. An isolated polynucleotide comprising a nucleotide sequence which encodes all or part
of a toxin which is active against a non-mammalian pest wherein said nucleotide sequence
encodes a toxin which comprises an amino acid sequence having at least about 75% homology
with SEQ ID NO. 76.

38. The isolated polynucleotide, according to claim 37, wherein said nucleotide sequence
encodes a toxin which comprises an amino acid sequence having at least about 90% homology
with SEQ ID NO. 76.

39. The isolated polynucleotide, according to claim 37, wherein said nucleotide sequence
encodes a toxin which comprises an amino acid sequence having the sequence shown in
SEQ ID NO. 76.

40. An isolated polynucleotide comprising a nucleotide sequence which encodes all or part
of a toxin which is active against a non-mammalian pest wherein said toxin comprises an
amino acid sequence having at least about 75% homology with a sequence selected from the
group consisting of SEQ ID NO. 78, SEQ ID NO. 80, SEQ ID NO. 82, SEQ ID NO. 84,
SEQ ID NO. 86, SEQ ID NO. 88, SEQ ID NO. 90, SEQ ID NO. 92, SEQ ID NO. 94, SEQ ID NO. 96,
SEQ ID NO. 98, SEQ ID NO. 100, SEQ ID NO. 102, and SEQ ID NO. 104.

41. The isolated polynucleotide, according to claim 40, wherein said nucleotide sequence
encodes a toxin which comprises an amino acid sequence having at least about 90% homology
with a sequence selected from the group consisting of SEQ ID NO. 78, SEQ ID NO. 80,
SEQ ID NO. 82, SEQ ID NO. 84, SEQ ID NO. 86, SEQ ID NO. 88, SEQ ID NO. 90, SEQ ID NO. 92,
SEQ ID NO. 94, SEQ ID NO. 96, SEQ ID NO. 98, SEQ ID NO. 100, SEQ ID NO. 102, and
SEQ ID NO. 104.

42. The isolated polynucleotide, according to claim 40, wherein said nucleotide sequence
encodes a toxin which comprises an amino acid sequence selected from the group consisting
of SEQ ID NO. 78, SEQ ID NO. 80, SEQ ID NO. 82, SEQ ID NO. 84, SEQ ID NO. 86,
SEQ ID NO. 88, SEQ ID NO. 90, SEQ ID NO. 92, SEQ ID NO. 94, SEQ ID NO. 96, SEQ ID NO. 98,
SEQ ID NO. 100, SEQ ID NO. 102, and SEQ ID NO. 104.



43. An isolated polynucleotide comprising a nucleotide sequence which encodes all or part
of a toxin active against a non-mammalian pest wherein said toxin immunoreacts with an
antibody to a toxin selected from the group consisting of SEQ ID NO. 70, SEQ ID NO. 72,
SEQ ID NO. 74, SEQ ID NO. 76, SEQ ID NO. 78, SEQ ID NO. 80, SEQ ID NO. 82, SEQ ID NO. 84,
SEQ ID NO. 86, SEQ ID NO. 88, SEQ ID NO. 90, SEQ ID NO. 92, SEQ ID NO. 94, SEQ ID NO. 96,
SEQ ID NO. 98, SEQ ID NO. 100, SEQ ID NO. 102, and SEQ ID NO. 104.

44. An isolated polynucleotide comprising a nucleotide sequence which encodes all or part
of a toxin active against a non-mammalian pest, wherein said nucleotide sequence hybridizes
with a nucleotide sequence which encodes an amino acid sequence selected from the group
consisting of SEQ ID NO. 70, SEQ ID NO. 72, SEQ ID NO. 74, SEQ ID NO. 76, SEQ ID NO. 78,
SEQ ID NO. 80, SEQ ID NO. 82, SEQ ID NO. 84, SEQ ID NO. 86, SEQ ID NO. 88, SEQ ID NO. 90,
SEQ ID NO. 92, SEQ ID NO. 94, SEQ ID NO. 96, SEQ ID NO. 98, SEQ ID NO. 100,
SEQ ID NO. 102, and SEQ ID NO. 104.

1 45. A purified or recombinant toxin having activity against a non-mammalian pest, 

2 wherein said toxin comprises an amino acid sequence having at least about 75% homology with 

3 a sequence selected from the group consisting of SEQ ED NO. 70, SEQ ID NO. 72, and SEQ ID 

4 NO. 74. 

1 46. The toxin, according to claim 43, wherein said toxin comprises an amino acid 

2 sequence having at least about 90% homology with a sequence selected from the group 

3 consisting of SEQ ID NO. 70, SEQ ID NO. 72, and SEQ ID NO. 74. 

1 47. The toxin, according to claim 43, wherein said toxin comprises an amino acid 

2 sequence selected from the group consisting of SEQ ID NO. 70, SEQ ID NO. 72, and SEQ ID 

3 NO. 7. 

1 48. A purified or recombinant toxin having activity against a non-mammalian pest, 

2 wherein said toxin comprises an ammo acid sequence having at least about 75% homology with 

3 the sequence shown in SEQ ID NO. 76. 

1 49. The toxin, according to claim 46, wherein said toxin comprises an amino acid 

2 sequence having at least about 90% homology with the sequence shown in SEQ ID NO. 76. 



1 

2 



50. The toxin, according to claim 46, wherein said toxin comprises an amino acid 
sequence shown in SEQ ID NO. 76. 
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1 51. A purified or recombinant toxin having activity against a non-mammalian pest, 

2 wherein said toxin comprises an amino acid sequence having at least about 75% homology with 

3 an amino acid sequence selected from the group consisting ofSEQ ID NO. 78, SEQ ID NO. 80\ 

4 SEQ ID NO. 82, SEQ ID NO. 84, SEQ ID NO. 86, SEQ ID NO. 88, SEQ ID NO. 90, SEQ ID 

5 NO. 92, SEQ ID NO. 94, SEQ ID NO. 96, SEQ ID NO. 98, SEQ ID NO. 100, SEQ ID NO. 102, 

6 and SEQ ID NO. 104. 

1 52. The toxin, according to claim 49, wherein said toxin comprises an amino acid 

2 sequence having at least about 90% homology with an amino acid sequence selected from the 

3 group consisting of SEQ ID NO. 78, SEQ ID NO. 80, SEQ ID NO. 82, SEQ ID NO. 84, SEQ 

4 ID NO. 86, SEQ ID NO. 88, SEQ ID NO. 90, SEQ ID NO. 92, SEQ ID NO. 94, SEQ ID NO. 

5 96, SEQ ID NO. 98, SEQ ID NO. 100, SEQ ID NO. 102, and SEQ ID NO. 104. 

1 53. The toxin, according to claim 49, wherein said toxin comprises an amino acid 

2 sequence selected from the group consisting of SEQ ID NO. 78, SEQ ID NO. 80, SEQ ID NO. 

3 82, SEQ ID NO. 84, SEQ ID NO. 86, SEQ ID NO. 88, SEQ ID NO. 90, SEQ ID NO. 92, SEQ 

4 ID NO. 94, SEQ ID NO. 96, SEQ ID NO. 98, SEQ ID NO. 100, SEQ ID NO. 102, and SEQ ID 

5 NO. 104. 

1 54. A recombinant host transformed with a polynucleotide comprising a nucleotide 

2 sequence which encodes all or part of a toxin which is active against a non-mammalian pest 

3 wherein said toxin comprises an amino acid sequence having at least about 75% homology with 

4 a sequence selected from the group consisting of SEQ ID NO. 70, SEQ ID NO. 72, SEQ ID NO. 

5 74, SEQ ID NO. 76, SEQ ID NO. 78, SEQ ID NO. 80, SEQ ID NO. 82, SEQ ID NO. 84, SEQ 

6 ID NO. 86, SEQ ID NO. 88, SEQ ID NO. 90, SEQ ID NO. 92, SEQ ID NO. 94, SEQ ID NO. 

7 96, SEQ ID NO. 98, SEQ ID NO. 100, SEQ ID NO. 102, and SEQ ID NO. 104. 

1 55. The recombinant host, according to claim 54, wherein said transformed host is a 

2 bacterium. 



1 56. The recombinant host, according to claim 55, wherein said bacterium is selected 

2 from the group consisting of E. coli, Pseudomonas, and Bacillus thuringiensis. 

1 57. The recombinant host, according to claim 54, wherein said transformed host is a 

2 plant. 

1 58. A method for the control of a non-mammalian pest which comprises contacting said 

2 pest with a pesticidal amount of a Bacillus thuringiensis toxin wherein said toxin has a 

3 characteristic selected from the group consisting of: 

4 (a) said toxin comprises an amino acid sequence having at least about 75% 

5 homology with a sequence selected from the group consisting of SEQ ID NO. 

6 70, SEQ ID NO. 72, SEQ ID NO. 74, SEQ ID NO. 76, SEQ ID NO. 78, SEQ 

7 ID NO. 80, SEQ ID NO. 82, SEQ ID NO. 84, SEQ ID NO. 86, SEQ ID NO. 88, 

8 SEQ ID NO. 90, SEQ ID NO. 92, SEQ ID NO. 94, SEQ ID NO. 96, SEQ ID 

9 NO. 98, SEQ ID NO. 100, SEQ ID NO. 102, and SEQ ID NO. 104; 

10 (b) said toxin comprises an amino acid sequence which is encoded by a nucleotide 

11 which hybridizes with a nucleotide sequence which encodes an amino acid 

12 sequence selected from the group consisting of SEQ ID NO. 70, SEQ ID NO. 

13 72, SEQ ID NO. 74, SEQ ID NO. 76, SEQ ID NO. 78, SEQ ID NO. 80, SEQ 

14 ID NO. 82, SEQ ID NO. 84, SEQ ID NO. 86, SEQ ID NO. 88, SEQ ID NO. 90, 

15 SEQ ID NO. 92, SEQ ID NO. 94, SEQ ID NO. 96, SEQ ID NO. 98, SEQ ID 

16 NO. 100, SEQ ID NO. 102, and SEQ ID NO. 104; and 

17 (c) said toxin immunoreacts with an antibody to a toxin selected from the group 

18 consisting of SEQ ID NO. 70, SEQ ID NO. 72, SEQ ID NO. 74, SEQ ID NO. 

19 76, SEQ ID NO. 78, SEQ ID NO. 80, SEQ ID NO. 82, SEQ ID NO. 84, SEQ 

20 ID NO. 86, SEQ ID NO. 88, SEQ ID NO. 90, SEQ ID NO. 92, SEQ ID NO. 94, 

21 SEQ ID NO. 96, SEQ ID NO. 98, SEQ ID NO. 100, SEQ ID NO. 102, and SEQ 

22 ID NO. 104. 

1 59. The method, according to claim 58, wherein said pest is a lepidopteran. 



1 60. The method, according to claim 59, wherein said pest is selected from the group 

2 consisting of Agrotis ipsilon, Heliothis virescens, and Helicoverpa zea. 